This past week, our team met with the clients at BEL who requested our help in improving their product. The NDAs are signed, and we were able to sift through the code to see how we should approach the problem. But first, a short introduction to the software we are dealing with.
FLOW (Forward Looking Operations Workflow) is an application that aims to improve productivity and reproducibility in data-intensive neuroscience experiments. The software consists of several tools for data analysis. Most notably, it uses Docker as the container runtime for Python workflows that experimenters write in order to share and distribute them. Reproducibility is one of the cornerstones of scientific rigor, and containers are an excellent tool for consistency, especially when dealing with a multitude of packages and dependencies. This tool and the others are housed in one product and are accessible through a UI built on top of the FLOW software. In addition, the FLOW software itself uses containers to make rolling updates. FLOW is, in a sense, both an IaaS and a SaaS offering, since the company provides the server hardware that runs the software it also builds.
So where do we come in? After meeting with the clients, we quickly realized that development of FLOW is still ongoing and that its end goals are still fluid. It was tempting to say we could build several components on top of FLOW, but we must be realistic about what we can achieve in the coming weeks. Having said that, our project aims to improve FLOW by adding parameter capabilities to the containerization toolset. Calculations run in the containers may rely on several constants or factors, and so far these values must be baked into the containers before they are uploaded. With parameterization, a container image prebuilt by one scientist can be configured to suit another's particular needs without being rebuilt.
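To make the idea concrete, here is a minimal sketch of how a Python workflow inside a container might pick up its constants at run time instead of having them baked into the image. It assumes parameters are passed as environment variables, which is only one possible mechanism and not necessarily what FLOW will use. The parameter names, the default values, and the `load_params` helper are hypothetical.

```python
import os

# Hypothetical parameter names and default values, for illustration only.
DEFAULTS = {
    "SAMPLING_RATE_HZ": 30000.0,
    "THRESHOLD_SD": 4.5,
}


def load_params(env=None):
    """Return every parameter as a float, preferring the environment value.

    `env` is any mapping of names to strings; it defaults to os.environ so
    the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    return {name: float(env.get(name, default)) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    # Inside the container this prints the values the workflow would use.
    print(load_params())
```

With a workflow written this way, a different threshold could be supplied when the container starts, for example with Docker's `-e` flag (`docker run -e THRESHOLD_SD=3 <image>`), while anyone who sets nothing still gets the original author's defaults.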
This goal comes in two parts. First and foremost, we must address how FLOW manages containers. The backend is written in JavaScript, uses Express to build the API, and is responsible for spinning up new containers on the local machine. Our team aims to add logic that stores the parameters associated with a container image, retrieves that information accurately when needed, and keeps track of both default and configured values. The second part is developing a UI to interact with this new feature. The front end is built from React components that call the same API to perform a given task. The UI work is simple enough, but we still have to exercise caution because the information entered here is stored in the same location (a MongoDB instance) as experiment data.
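The default-versus-configured bookkeeping described above can be sketched in a few lines. The example below is written in Python for readability rather than in the backend's JavaScript, and it is an illustration of the idea, not FLOW's actual code. The `resolve_params` name and the parameter names are hypothetical, and rejecting unknown parameter names is our own assumption about how to keep stray input out of the shared database.

```python
def resolve_params(defaults, overrides):
    """Merge user-configured values over an image's stored defaults.

    Names that the image does not declare are rejected instead of being
    silently stored, so a typo never ends up saved next to real data.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    # Later keys win, so configured values replace defaults of the same name.
    return {**defaults, **overrides}


# Hypothetical defaults stored alongside one container image.
image_defaults = {"threshold_sd": 4.5, "bin_ms": 10}

# One user's configuration changes a single value and keeps the other default.
run_params = resolve_params(image_defaults, {"bin_ms": 5})
```

Keeping the defaults with the image and storing only the overrides for each configuration means the original author's values are never overwritten.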
All in all, we are looking forward to building this feature and hope to have time for additional improvements. Software development is always incremental, and this project is no exception. Perhaps the next step is to improve how the container images associated with FLOW are generated, but for now that process remains an external, manual step.