Arangopipe, a Tool for Machine Learning Meta-Data Management

Tracking #: 690-1670


Responsible editor: 

Brian Davis

Submission Type: 

Resource Paper

Abstract: 

Experimenting with different models, documenting results and findings, and repeating these tasks are day-to-day activities for machine learning engineers and data scientists. Keeping track of the machine learning pipeline and its metadata lets users iterate quickly through experiments and retrieve key findings and observations from historical activity. Arangopipe serves this need. It is an open-source tool that provides a data model capturing the essential components of any machine learning lifecycle, together with an application programming interface that lets machine learning engineers record the details of the salient steps in building their models. This paper describes the components of the data model, gives an overview of the application programming interface, and presents illustrative examples of basic and advanced machine learning workflows. Arangopipe is useful not only to users developing machine learning models but also to those deploying and maintaining them.

Tags: 

  • Under Review

Data repository URLs: 

The data and code associated with this submission are available at:

https://github.com/arangoml/arangopipe

Date of Submission: 

Friday, March 26, 2021