Introduction

For the past two years I’ve been working at ECS as a member of the DATA team. Our primary job is to develop and maintain RESTful APIs for the school. Our APIs are starting to be used all over OSU, from the search page to the campus map.
I joined ECS at the end of my sophomore year. It was very exciting for me because this was my first job as a software developer. I had had some small jobs before, but this was the first time I was being paid to code!
I have learned a lot at ECS about programming and software engineering in practice. Classes do not come close to preparing you for what a real job is like, or for working on a real code base. Even the capstone series doesn’t really prepare you.
Titus Winters defines software engineering as “programming integrated over time.” Over the course of my two years at ECS, I feel like I’ve gotten a taste of what real software engineering is like.
Although I am not quite finished with my degree, I’ve decided that it is time to move on.
I’m grateful for the time I spent with my coworkers and for the connections I made.
I worked with the incredibly talented José Cedeno, who has since left OSU. I worked with fellow students Jared and Tso-Liang, who later went on to become full-time employees. I worked with Takumi and Shujin. We recently welcomed two new students: Ian and Ian, and I’m confident that they will carry on the DATA mantle.

What we accomplished

When I started at ECS, we had two public APIs:
  • DirectorySearch, for LDAP directory entries, searchable by first or last name, email, or username.
  • LocationsSearch, for OSU locations, including buildings, dining locations (with open hours), and extension offices.
Now we have ten, plus a few more that aren’t public, and plans for even more. Of course, I can’t take credit for most of those; that goes to the rest of the team. But it does show that our team is growing fast. We are also responsible for maintaining all those APIs.
In the time that I’ve worked here, we’ve gone from a mostly manual deployment process to a completely automated one.
When a pull request is merged into master in one of our repositories, it automatically kicks off a build in Jenkins. Jenkins builds the project, runs the unit tests, and if the tests succeed, deploys to our development server. Once that’s done, our integration tests run against the live API. If everything seems stable, we tell Jenkins to kick off a build of the production copy, which goes through the same process and deploys to the production server. The API servers all run inside Docker containers for increased security.
This process is called continuous delivery, and I played an integral part in building the infrastructure for it.
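To give a flavor of that integration-test step, here’s a minimal sketch of what a post-deploy check against the live API might look like (the environment variables and the response envelope are illustrative assumptions, not our actual test suite):

import os
import requests

# Hypothetical configuration; a real run would point at the dev deployment.
BASE_URL = os.environ.get("API_BASE_URL", "https://api.oregonstate.edu/v1")
ACCESS_TOKEN = os.environ["ACCESS_TOKEN"]

def test_locations_endpoint_is_healthy():
    # Hit the live deployment and verify the basic contract still holds.
    resp = requests.get(
        BASE_URL + "/locations",
        headers={"Authorization": "Bearer " + ACCESS_TOKEN},
        timeout=10,
    )
    assert resp.status_code == 200
    assert "data" in resp.json()  # assumes a JSON envelope with a "data" key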
One of the highlights of the year was participating in the OSU IT Hackathon. 
The first year the hackathon happened, I had classes all day, so I wasn’t able to contribute much. The second year, I was away on an internship. But finally, this year I had the entire day free and was able to see the project through from start to finish.
Together with Jared, Tso-Liang, AJ, Takumi, and Ian, we built a prototype of an integration between the campus map and the course schedule, showing markers on the map for each building that you have a class in.
We even won the grand prize!
Other things I’m proud of:
  • Together with José, I designed and built an API from scratch for securely storing encrypted data.
  • I worked on 3-legged OAuth integration, which will enable us in the future to build APIs that provide access to personal information like, for example, just hypothetically, course registration data.

What I learned

  • I got to play with technologies that I never would have on my own: Jenkins, Vagrant, Docker, Ansible, Groovy, and Gradle, to name a few. If you want to go into DevOps, having experience with these tools is vital. You’ll see them pop up a lot.
  • I experienced what it was like to work with a team over a long period of time, as part of a large organization. 
  • I saw first-hand how a codebase evolves over a period of several years. I saw how we modified our APIs to adapt them to changing requirements and new sources of data; how they sometimes failed; and how we learned from those failures to make them more robust, sometimes re-architecting them in the process. I experienced technical debt, shifting plans, and real software engineering.
  • I was part of a hiring committee, which was an enlightening experience. I’m sure my experience sitting on the other side of the table will help me during my quest for a job later down the road.

The future

Looking forward, it feels like the DATA team will keep expanding until it becomes a central part of OSU.
Data and APIs are the future. A lot of data at OSU is tied up in private databases, exposed only through clunky web pages. Good, open APIs enable developers to create applications that nobody has even thought of yet. They allow people to create mash-ups of services, like last year’s hackathon project, which taught Alexa how to answer questions about people and places on campus using our APIs.
One of the things I had really hoped we would complete before I left is the catalog API. We released a preview of it in 2016, but ran into difficulty getting permission to use the data, so the project was temporarily shelved. Hopefully we will pick it up again soon.

APIs available on our developer portal use Apigee as an API gateway. Apigee proxies our API traffic to handle functionality like authorization, authentication, caching, and load balancing, among other things. Our current license with Apigee allows us 8 million API calls per month. That quota has given us a comfortable amount of headroom since we started using Apigee in 2015, but since the start of this academic year, we’ve been getting too close to the 8 million limit (which is a good problem to have!). We don’t expect API traffic to slow down, so we are changing our plan with Apigee to increase our monthly quota. As part of this upgrade, Apigee is also moving our gateway to better infrastructure hosted on Amazon Web Services. Our new agreement increases our monthly quota to 300 million calls per month and gives us a higher runtime SLA.

One downside to this upgrade is that it requires an outage. Apigee has performed these upgrades for its other customers before, so they expect the outage to last 10-45 minutes. We have scheduled the outage for Thursday, November 2nd, at 7:00 AM (please see the update below for the new upgrade date). This will be a total outage of all our APIs on api.oregonstate.edu, including the development and test environments.

Apologies for any inconvenience caused by this outage, and thank you for bearing with us as we work to improve our APIs and adapt to increasing traffic.

Update

As of 10:00 AM November 2nd, all APIs are back online. The upgrade was unsuccessful and Apigee had to roll back the changes they were performing. The upgrade will be attempted again at a future date and time. Apigee identified the problem that prevented the upgrade and will address it before the next attempt. We apologize that the outage lasted longer than expected. We will be working with Apigee to make sure the next upgrade causes less interruption to API traffic.

Apigee is going to try the upgrade again on Thursday, November 9th, at 6:00 PM. The problem with the previous attempt was found to be related to access tokens: copying them was slower than expected, so Apigee decided to abort the upgrade. For the upgrade scheduled on November 9th, Apigee will not copy access tokens. Since most access tokens expire in an hour or less, skipping them reduces both the outage time and the risk associated with the upgrade.

Final Update

Apigee successfully completed the upgrade, and all API traffic was returned to service at 6:20 PM on November 9th. Thanks for bearing with us!

One of our most popular APIs is the locations API. The locations API is used to get campus buildings, extension campus locations, and dining locations on campus. Since the word “location” can describe many types of places, we actively source new locations to add to the API and new data to add to existing locations. While sourcing new locations and data, we work with data stewards to ensure the data we provide is accurate. One example of enhancing existing locations in the API is the recent addition of building geometries to Corvallis campus buildings.

Centroids

Initially, campus buildings in the locations API included a coordinate pair representing the centroid of the building. This can be useful as an alternative to the building’s address for placing a point on a map. Better yet, the coordinates can be queried against by specifying lat and lon query parameters in the URL of a locations API request. These parameters return buildings that are close to the coordinates provided in the URL. Use the distance and distanceUnit query parameters for a more specific query.

Here’s an example of a locations API request that returns all locations that are within 300 yards of the Valley Library:

https://api.oregonstate.edu/v1/locations?lat=44.56507&lon=-123.2761&distance=300&distanceUnit=yd
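For illustration, here’s the same query made from Python with the requests library (a sketch: the Authorization header assumes you already have an access token, and the response field names assume a JSON envelope with a data list):

import requests

resp = requests.get(
    "https://api.oregonstate.edu/v1/locations",
    params={
        "lat": 44.56507,
        "lon": -123.2761,
        "distance": 300,
        "distanceUnit": "yd",
    },
    headers={"Authorization": "Bearer <access token>"},
    timeout=10,
)
for location in resp.json()["data"]:  # assumed response envelope
    print(location["attributes"]["name"])  # assumed attribute layout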

Geometries

Centroid coordinates are useful for distance-related queries, but what if you want to draw the shape of a building on a map? A new dataset we recently added to buildings is geometry coordinates. Geometry coordinates can be used with services like the Google Maps API to draw building shapes on a map. A good open-source alternative to the Google Maps API is Leaflet, which can also map coordinates from the locations API.

Buildings in the locations API now have a geometry object which follows the GeoJSON specification. Within the geometry object are type and coordinates. The type will be either Polygon or MultiPolygon, depending on the location. Locations with multiple physical structures (like Magruder Hall) are MultiPolygon, while Polygon is for a location with only one structure. Most buildings on campus are Polygon locations.

Let’s take a closer look at a simple polygon location, Hovland Hall:

"geometry" : {
"type" : "Polygon",
"coordinates" : [ [ [ -123.281543, 44.566486 ], [ -123.281544, 44.56636 ], [ -123.281041, 44.566359 ], [ -123.281041, 44.566485 ], [ -123.281543, 44.566486 ] ] ]
}

Coordinates for a Polygon location are a three-dimensional array of coordinate pairs, where index [0] of the third level of the array is longitude and index [1] is latitude. The second level of the array is an array of coordinate pairs, otherwise known as a ring. The first level of the array is an array of rings. Each ring is a set of coordinate pairs that, if connected in order, would draw the shape of the building. As a rule of GeoJSON, the first and last coordinate pairs in a ring must be identical. The Hovland Hall example has five coordinate pairs (with the first and last being identical), making up one ring within one polygon.
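As a quick sketch, here’s how you might pull the exterior ring out of that geometry object and flip each pair into (latitude, longitude) order for mapping libraries that expect it:

# The geometry object from the Hovland Hall example above.
geometry = {
    "type": "Polygon",
    "coordinates": [[[-123.281543, 44.566486], [-123.281544, 44.56636],
                     [-123.281041, 44.566359], [-123.281041, 44.566485],
                     [-123.281543, 44.566486]]],
}

# The first ring is the exterior; each pair is [longitude, latitude].
exterior_ring = geometry["coordinates"][0]

# Flip to (latitude, longitude) for libraries that expect that order.
latlon_ring = [(lat, lon) for lon, lat in exterior_ring]
print(latlon_ring[0])  # (44.566486, -123.281543)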

Some buildings on campus have multiple rings (multiple arrays of coordinate pairs). A polygon with multiple rings represents a building with holes in it, like Cordley Hall. In an array of rings, the first ring represents the exterior structure of the building, while any additional rings are holes (interior rings). Moreover, GeoJSON specifies the wrap direction of exterior and interior rings. Wrap direction is the direction a ring is drawn when laying out each coordinate pair on a map in order: exterior rings wrap counterclockwise, while interior rings wrap clockwise. It’s worth noting, though, that services like the Google Maps Polygon API only care that exterior and interior rings have opposite wrap directions.
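If you want to check a ring’s wrap direction yourself, the shoelace formula works: the sign of the ring’s signed area tells you which way it is drawn. A small sketch (not part of the API):

def is_counterclockwise(ring):
    # Shoelace formula: a positive signed area means the ring is drawn
    # counterclockwise (an exterior ring, per the GeoJSON convention).
    area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        area += x1 * y2 - x2 * y1
    return area > 0

# The exterior ring of Hovland Hall, as (longitude, latitude) tuples.
ring = [(-123.281543, 44.566486), (-123.281544, 44.56636),
        (-123.281041, 44.566359), (-123.281041, 44.566485),
        (-123.281543, 44.566486)]
print(is_counterclockwise(ring))  # True: the exterior ring runs counterclockwise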

[Figure: a donut with labels showing the difference between an exterior and interior ring. Buildings with holes in them are like donuts, where the interior ring represents the hole in the middle.]
[Figure: a donut with two holes, representing a polygon with two interior rings. Buildings can have multiple interior rings, which represent multiple holes.]

Since MultiPolygon locations are locations with multiple structures, their coordinates array adds another dimension to represent an array of polygons. All the same rules apply, except that the coordinates array for a MultiPolygon is four-dimensional.
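A small helper can smooth over the difference by normalizing either type to a flat list of polygons. A sketch, reusing the geometry object from the Hovland Hall example above:

def polygons_of(geometry):
    # Normalize a Polygon or MultiPolygon geometry to a list of polygons,
    # where each polygon is a list of rings (exterior ring first).
    if geometry["type"] == "Polygon":
        return [geometry["coordinates"]]
    if geometry["type"] == "MultiPolygon":
        return list(geometry["coordinates"])
    raise ValueError("unexpected geometry type: " + geometry["type"])

for polygon in polygons_of(geometry):
    exterior, holes = polygon[0], polygon[1:]
    print(len(exterior), "exterior points,", len(holes), "interior rings")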

Do you have any ideas for data to add to the locations API? Contact us to share your ideas, or visit our developer portal to register an application and try the locations API: developer.oregonstate.edu

This year our team participated in the second annual Hackathon hosted by the Information Services department. Teams were given around 7 hours to create something before presenting their creations to all the participants and being judged on their work. Awards were given out at the end for categories like simplification, partnership, and learner experience.

Our team set out to create some custom skills for Amazon Alexa, Amazon’s virtual assistant voice service. We wanted Alexa to be able to answer questions about OSU, so we decided to use the APIs we’ve built as the data source for the answers. As part of our project, we also had to create a new API to function as an intermediary between the Alexa voice service and the APIs providing the data. Amazon lets you use either an AWS Lambda function or an HTTPS endpoint to facilitate the interaction between the Alexa service and a backend data source.

Since we opted for the HTTPS option, we had to build our API around the specific JSON schema that Alexa sends and expects to receive. Amazon provides the Alexa Skills Kit to let developers create a skill that has a number of intents. A skill always has an invocation name that lets Alexa know which skill a person wants to use. We chose “Benny” as the invocation name for our skill, since the questions Alexa would answer would all be related to OSU. Intents are the types of actions that can be performed within a skill. To trigger an intent we created, we would start by saying “Alexa, ask Benny…”. When an intent is triggered, Alexa sends a request to the API we created during the hackathon. Depending on the intent, our API calls one of our backend APIs to get the data for a response, uses the data to create a text response meant to be spoken, and returns the response to Alexa.
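To give a feel for the shape of that exchange, here’s a heavily simplified sketch of an HTTPS endpoint handling an intent request (Flask, with a made-up intent name; our actual hackathon API differed):

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/alexa", methods=["POST"])
def alexa_webhook():
    # Assumes an IntentRequest; a real skill also handles launch/end requests.
    body = request.get_json()
    intent = body["request"]["intent"]["name"]

    if intent == "OpenRestaurantsIntent":  # hypothetical intent name
        # In the real skill, this is where we'd call our locations API.
        speech = "West Dining Hall is open right now."
    else:
        speech = "Sorry, I don't know how to answer that yet."

    # Respond with the JSON structure the Alexa service expects.
    return jsonify({
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    })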

We used the locations API for several of the intents we created. The data in the locations API allowed us to answer questions like “what restaurants are open right now?”, “is the library open today?”, and “what restaurants are close to me?”.

We used the directory API to create an intent to look up information about people on campus. We can ask things like “what is the email address for Edward Ray?” and “what is the phone number for Wayne Tinkle?”.

Our team also created intents that used our terms API and class search API. For example, to get a list of open terms, you’d say “Alexa, ask Benny what terms can I register for?”. We also created the PAC (physical activity course) intent. When I was a student, I would often find myself looking for a random 1-2 credit class that fit around the rest of my schedule. The PAC classes were nice because I could do fun things like biking, running, or rock climbing. The PAC intent lets you say “give me a PAC class for Fall 2017 at 2:00 PM on Mondays”, and Alexa will find a random PAC class that fits that schedule.

After the hackathon, we created a video to demo some of the intents we created with an Amazon Echo. However, you don’t need an Amazon Echo to develop and test Alexa skills. There are many applications out there that allow you to test an Alexa skill, like EchoSim.

Video Demo: https://media.oregonstate.edu/media/t/0_vqlnak06

Amazon lets anyone beta test a skill they create by linking an Alexa-enabled device (like the Echo or EchoSim) to their account. Releasing a skill so it’s available to any Alexa device requires approval from Amazon. Since the skill we created at the hackathon was a proof of concept, we didn’t submit it for approval, so it isn’t publicly available.

Centralizing Access Token Requests

The current method to get an access token for one of our APIs is to make a POST request containing a client ID and client secret to the API, appending “/token” to the end of the URL. For example, the first URL below makes an access token request, and the second makes an API request to the locations API:
  • POST https://api.oregonstate.edu/v1/locations/token
  • GET https://api.oregonstate.edu/v1/locations
Today, we are announcing the OAuth2 API, a centralized API for OAuth2-related requests. Developers can use it to request an access token:
  • POST https://api.oregonstate.edu/oauth2/token
The token endpoint for the OAuth2 API allows access token requests for any API. Developers can then use the same access token in the Authorization header of their API requests as usual.
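In Python, a token request against the new endpoint might look like this (a sketch assuming the standard OAuth2 client credentials grant; the placeholder credentials come from the developer portal):

import requests

token_resp = requests.post(
    "https://api.oregonstate.edu/oauth2/token",
    data={
        "client_id": "<your client id>",
        "client_secret": "<your client secret>",
        "grant_type": "client_credentials",
    },
)
access_token = token_resp.json()["access_token"]

# The same token can then be used against any API it is authorized for.
resp = requests.get(
    "https://api.oregonstate.edu/v1/locations",
    headers={"Authorization": "Bearer " + access_token},
)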

Deprecation

Today, we are also deprecating the decentralized “/token” endpoints for our APIs. We plan to remove the token endpoints from our APIs on Monday, November 13th, 2017, and we encourage you to start using the OAuth2 API for access token requests now. Before the production change on November 13th, we’ll remove the decentralized token endpoints from our development environment on October 30th, 2017.
After Monday, November 13th, 2017, you won’t be able to get an access token by adding “/token” to the end of a request URL. For example, these requests won’t work after that day:
  • POST https://api.oregonstate.edu/v1/directory/token
  • POST https://api.oregonstate.edu/v1/locations/token
Instead, please use the OAuth2 API to get an access token. Link to documentation. 

OAuth2

Oregon State University uses OAuth2 for API authentication and authorization. When someone registers an application on our developer portal, they get a client ID and client secret, which are used during the API request process. To access an API resource, the client ID and secret are used in a token request to the OAuth2 API: POST https://api.oregonstate.edu/oauth2/token

The response to a token request includes an access token, which is used to get access to an API and has a limited lifetime. The response also includes a token expiration time and a list of APIs the access token may be used with. A developer can then use the access token in the header of a request to access any API the token is authorized for. This process works well for public data (like the locations or directory APIs) or when only specific people or departments can use an API.

Three-legged OAuth

Removing the decentralized token endpoints from our APIs lets us direct all access token requests to one API instead of each individual one. This makes things simpler, but it also lets us expand our use of OAuth2 beyond access token requests. One of the components of OAuth is the three-legged flow, which allows an end-user to grant an application permission to access certain data about the user. For example, think about how applications on the web share data with each other. Say a developer creates a web form that lets a user auto-fill information from their Facebook profile. The form directs the user to Facebook to authorize the application to access the user’s data. This is an example of three-legged OAuth.

Enabling three-legged OAuth lets us expand our scope when developing APIs to deal with more confidential or sensitive data, and lets users decide whether an application should access data about them. As an example, think about an API that retrieves a student’s grades. The developer and the student (the user in this example) shouldn’t have access to everyone’s grades, only to their own. A student would log in (authenticate) before deciding whether the application is allowed to retrieve their grades.
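In code, the three-legged flow adds two steps in front of the token request: send the user to an authorization page, then exchange the returned code for a token. A rough sketch; the authorize endpoint and redirect URI below are hypothetical placeholders, not announced OSU endpoints:

import requests

AUTHORIZE_URL = "https://api.oregonstate.edu/oauth2/authorize"  # hypothetical endpoint
TOKEN_URL = "https://api.oregonstate.edu/oauth2/token"
REDIRECT_URI = "https://myapp.example.com/callback"  # hypothetical application callback

# Step 1: send the user's browser to the authorization page, where they
# log in and approve (or deny) the requested access.
login_link = (
    AUTHORIZE_URL
    + "?response_type=code"
    + "&client_id=<your client id>"
    + "&redirect_uri=" + REDIRECT_URI
)
print(login_link)

# Step 2: after approval, the user is redirected back with a short-lived
# code, which the application exchanges for a token scoped to that user.
def exchange_code(code):
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": "<your client id>",
        "client_secret": "<your client secret>",
    })
    return resp.json()["access_token"]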

For more information on the OAuth standard, go to https://oauth.net/2/

Register an application on the developer portal to get started using some of OSU’s APIs: https://developer.oregonstate.edu

I’ve been working at Enterprise Computing Services at Oregon State University as a student developer for one year. Our group mainly focuses on developing Application Programming Interfaces (APIs) and related work. As a Computer Science graduate student, this real-world working experience gave me a very good opportunity to apply my academic knowledge and concepts to practical situations. Since various tools are involved in software development, a student developer should be able to learn and use these tools quickly.

Last summer, I worked on a big project which used two open source tools, Ansible and Jenkins, to automate our APIs’ deployment process and implement continuous integration. Because of this project, I got a chance to demo and present my work in front of a large audience, including stakeholders. Being able to explain a concept clearly to an audience is an important skill for a professional software developer. From my experience, the following areas are important to any student who wants to become a professional developer.

  • Always practice and enhance your coding skills. Pick one or two programming languages you are most familiar with and keep practicing them. If possible, also choose some frameworks (e.g. Django or Flask) and implement your own project. Push your projects to GitHub so that people can easily see your technical skills and enthusiasm.
  • Writing skills are also important. As time goes by, it is easy to forget the details of a project you worked on. Being able to write clear explanations of a project, with good documentation, saves you and your teammates a lot of time.
  • Time management. Manage your time wisely, especially when you are a student and a software developer at the same time. Scheduling your tasks by priority is always a good idea; it helps you finish your tasks on time and efficiently.
  • Finally, good communication is the most critical skill you need. From regular group meetings to large meetings across departments, you will have many opportunities to demo your work or persuade other people. Strong communication skills will absolutely help you and your team.

In a nutshell, technical skills and soft skills are equally important for a professional software developer. Any student who can demonstrate the abilities above will make a good software developer. Remember that opportunity favors the prepared mind. I hope all future student developers find a great journey here!

For the past year and a half, I’ve been working on two teams in Information Services at OSU as a student developer. My first year, in SIG (Shared Infrastructure Group), gave me an opportunity to work on the open source CoprHD project and to collaborate regularly with community members and developers from industry. As the team leader, besides the work assigned to me, I was also responsible for tracking the progress of my team members and ensuring on-time delivery of features. Giving presentations about our work, for different purposes and to audiences from various backgrounds, was another important part of this project.

On the second team, I was one of the developers. The work involved more diverse tasks, which let me pick up many skills in a short period. As in my previous job, presentations were frequent: team members shared knowledge by demoing or explaining their work to each other. I also went through several interviews over the past year, which gave me a sense of the skills the industry wants. Looking back on those interviews and on my job, there are some tips I can share with students who want to be developers, some of which I wish I had followed better myself.

Communication Skills

I know this might sound cliché, but communication is one of the most effective skills for reaching your goal, whether that’s getting the work done or getting the help you need. Specifically, being able to ask questions wisely will bring you the answers you need, reduce avoidable mistakes, and get you out of tar pits. Communication over email is another routine: a clear and concise email not only shows your professionalism but also leads to prompt replies.

Moreover, as described above, presentations are commonplace at work, so being able to present your work is not just desirable but necessary. During interviews, you will very likely be asked to introduce projects you have done, while at work it’s even more common to demonstrate your work and ideas to your team or a larger audience. Being able to deliver your thoughts clearly, or even vividly, is a big plus in the job market.

Technical Skills

Technical skills are your core competencies, and it’s critical to keep strengthening the skills you already have. Being able to learn new skills quickly, on demand, might be even more valuable. It is very common to be unfamiliar with some of the techniques used in your project, whether a programming language, a framework, or a tool. Being able to grasp that knowledge and apply it to the project is also a demonstration of your ability to work.

Another suggestion is to be aware of your thought process and try to identify and refine the one that is most efficient for you. Your thought process, such as how you learn things, how you debug, and how you figure out a problem, will change over time with experience. An awareness of those changes will help you improve your efficiency. It also helps in interviews, since many interviewers try to identify your thought process from the way you answer their questions.

Additionally, taking notes on the work you have done is one of the things I wish I had done better. No matter how confident you are that you will remember the work, after leaving it aside for a couple of months, or even weeks, everything will look completely new to you.

In a nutshell, technical skills and communication skills are the way to becoming a professional developer. I believe that anyone who keeps enhancing the skills discussed above, and can show them to employers, will be a strong hire in the job market.

I’ve been working in IT at OSU as a student for the past 3 years, but more recently I’ve been taking on more of the responsibilities of a developer. Growing into the student developer role has been well-timed with my degree in Business Information Systems: my senior-year coursework this past academic year involved more software architecture and development leading up to graduation. The overlap between the work I’ve done for my job and for my degree has been complementary, letting me share skills and techniques between the two disciplines.
Taking classes in a business environment has given me a different perspective on software development at work. Information systems business classes, besides teaching programming, focus on making sure the outcome of software development is successful and addresses the needs of stakeholders. We were taught to focus on the problem being solved, present a solution to stakeholders in non-technical terms, and develop measures of success to ensure the outcome isn’t a failure. These lessons, along with my experience as a developer, inform the advice I have for students who want to be developers:
  • Be able to communicate non-technically when needed, whether with a supervisor, customer, or colleague in a different department. Being in a software development role means taking on work that requires special skills and knowledge unique to you and your team. The ability to propose a better solution to a problem, explain an issue to a stakeholder, or describe the work being done to someone less technically proficient is key. I’ve always believed that being able to teach a topic or skill is a marker of proficiency in that area; in software development and IT, being able to communicate something non-technically is a similar marker.
  • Remember the importance of soft skills like verbal communication, demo and presentation skills, and writing. According to a survey conducted by the Technical Councils of North America, 70% of employers say soft skills are equally as important as technical skills for success in a software development career. My experience in IT and software development has taught me the importance of these soft skills, and it has always been beneficial to keep them sharp through practice, whether giving a demo at work or a presentation for a class.
  • Learn and practice technical skills through projects and practical experience. Learning technical skills is very important, but I would advise aspiring developers to practice and maintain their skills in ways that are demonstrable to employers. Being able to show a coding project, or talk about projects accomplished during a job or internship, may be required during the interview or application process. Knowledge of development skills is the foundation, but being able to demonstrate those skills matters when pursuing a career as a developer.
At the end of the day, good technical skills are at the core of software development. However, getting into software development as a career can be difficult without much prior experience. I believe demonstrating the skills above shows employers that someone can grow into a developer position while continuing to diversify their technical and soft skills.

This article summarizes the current security solutions for Docker containers. The solutions in this blog post have been discussed and designed by the Docker community. You will also find valuable tips on how to enhance security while running Docker in a production environment.

Possible Security Issues in a container-based environment

Before we jump into the security solutions, let’s explore some security issues of container-based systems. Generally speaking, there are three types of attack models, each enabled by vulnerabilities in container-based systems.

Types of Attacks:

  • Container compromise: results in illegitimate data access and can affect the control flow of instructions
  • DoS (Denial of Service): disturbs normal operation of the host or other containers
  • Privilege escalation: obtains a privilege which was not originally granted to the container

Disclosed Vulnerabilities:

  • Namespacing Issues – Docker containers use kernel namespaces to provide a certain level of isolation. However, not all resources are namespaced:
    • UID: the container’s root user is the host’s root user, causing a “root” user vulnerability
    • Kernel keyring: containers running with users of the same UID have access to the same keys if the keys are handled by the kernel keyring
    • Kernel & its modules: loaded modules become available across all containers and the host
    • Devices: includes disk drives, sound cards, GPUs, etc.
    • System time: the SYS_TIME capability is disabled by default, but if it’s enabled, a container can change the host’s clock
  • Kernel Exploit – Container-based applications share the same host kernel, so flaws in the host kernel might allow malicious containers to escape and gain access over the whole system.
  • DoS Attacks – Since all containers share kernel resources, if a container or user consumes too much of a certain resource, it will starve out other containers on the host.
  • Container Breakout – Because users are not namespaced by default, any process that breaks out of a container will have the same privileges on the host as it did in the container: if you were root in the container, you will be root on the host. This is a classic privilege escalation attack, unlikely to happen, but possible.
  • Poisoned Images – It’s possible for attackers to embed malicious programs in an image and trick users into downloading the corrupt image.
  • Compromising secrets – Applications need credentials to access databases or backend services. An attacker who can get access to these credentials will also have the same access as the application. This problem becomes more acute in a microservice architecture in which containers are constantly stopping and starting.

Current Solutions:

Now let’s take a look at the security solutions that come with the current Docker implementation and the strategies and techniques that can be used in production.

Least Privileges

One of the most important principles for container security is least privilege: each process and container should run with the minimum set of access rights and resources it needs to perform its function. This includes several actions to reduce the capabilities of containers:

–  Do not run processes in a container as root, to limit what an attacker gains from compromising a process.

–  Run filesystems as read-only so that attackers cannot overwrite data or save malicious scripts to file.

–  Cut down the kernel calls a container can make, to reduce the potential attack surface.

–  Limit the resources a container can use.

This least privilege approach reduces the possibility that an attacker can access or exploit data or resources via a compromised container.
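As a sketch of what those reductions look like in practice, here’s how they might be expressed with the Docker SDK for Python (the image name and limits are illustrative):

import docker

client = docker.from_env()

container = client.containers.run(
    "myapp:1.0",                   # illustrative image
    user="1000",                   # don't run as root
    read_only=True,                # read-only root filesystem
    cap_drop=["ALL"],              # drop all capabilities...
    cap_add=["NET_BIND_SERVICE"],  # ...then add back only what's needed
    mem_limit="256m",              # cgroup memory limit
    pids_limit=100,                # cap the number of processes
    detach=True,
)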

Internal Security Solutions

Containers can leverage Linux namespaces and control groups (cgroups) to provide a certain level of isolation and resource limitation.

Namespace

Docker provides process, filesystem, device, IPC, and network isolation by using the related namespaces.

  • Process Isolation: Docker uses the PID namespace to separate container processes from the host and from other containers, so that processes in a container can’t observe or affect processes running on the host or in other containers.
  • Filesystem Isolation: Docker uses the mount namespace so that mounts made inside a container only have an effect inside that container.
  • Device Isolation: A container cannot access any devices unless it is privileged.
  • IPC Isolation: Docker uses the IPC namespace to prevent processes in a container from interfering with those in other containers.
  • Network Isolation: Docker uses the network namespace so that each container has its own IP address, IP routing tables, network devices, etc.

Control Group

Docker employs cgroups to control the amount of resources, such as CPU, memory, and disk I/O, that a container can use. Under this control, each container is guaranteed a fair share of the resources and is prevented from consuming all of the available resources.

Linux Kernel Security Systems

Kernel security systems exist to harden a Linux host system. We can also use them to secure the host from containers.

By default, Docker disables a large set of Linux capabilities for its containers in order to prevent an attacker from damaging the host system when a container is compromised. It also allows configuring the capabilities that a container can use.

Linux Security Module (LSM)

The two most popular LSMs are AppArmor and SELinux:

  • SELinux is a labeling system that implements mandatory access control using labels. Every object, such as a process, file or directory, network port, or device, has a label, and rules control access between labeled objects.
  • AppArmor is a security enhancement model for Linux, based on mandatory access control like SELinux. It lets the administrator load a security profile for each program, which limits the capabilities of that program.

Another Approach

Seccomp

The Linux seccomp (secure computing mode) facility can be used to restrict the system calls that a process can make; in other words, containers can be locked down to a specified set of system calls.
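With the same Docker SDK for Python, applying a custom seccomp profile is a single security_opt entry. A sketch; the profile file is a placeholder, and writing the profile itself is a separate exercise:

import docker

client = docker.from_env()

# seccomp=<profile> restricts the container to the system calls allowed
# by the JSON profile; everything else is denied.
with open("my-seccomp-profile.json") as f:  # hypothetical profile file
    profile = f.read()

container = client.containers.run(
    "myapp:1.0",  # illustrative image
    security_opt=["seccomp=" + profile],
    detach=True,
)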

In Production

When running Docker in a production environment, you will want to leverage the security solutions listed above and apply proper precautions to build a more secure and robust system. There are three major security tips to keep in mind when running Docker in production.

Segregate Containers by Host

The main reason to place each user on a separate Docker host is to minimize the loss when a container breakout happens. If multiple users share one host, a user who monopolizes all the memory on the host will starve out the other users. Even worse, if a container breakout happens, one user could gain access to another user’s containers or data through the compromised container.

Therefore, although this approach is less efficient than sharing hosts between users and results in a higher number of VMs and/or machines, it’s important for security.

A similar measure is to separate containers holding sensitive information from less sensitive ones, for the same reason.

Applying Updates

Just as is recommended for Windows systems, apply updates regularly. This includes updating base images and dependent images to fix vulnerabilities in common utilities and frameworks. At times, you will need to update the Docker daemon itself to gain access to new features, security patches, or bug fixes. Removing unsupported drivers is also important; they can be a security risk because they won’t receive the same attention and updates as other parts of Docker.

Image Provenance

To safely use images, you need to have guarantees about their provenance:

  • where they came from
  • who created them
  • assurance that you are getting exactly the image you want

There are three solutions for image provenance: secure hashes, secure signing and verification infrastructure, and proper use of Dockerfiles.

  • Secure Hash: A secure hash is like a fingerprint for data, a small string that is unique to the given data. If you have a secure hash for some data and the data itself, you can recalculate the hash for the data and compare. In Docker, this is called a digest: a SHA-256 hash of a filesystem layer or of a manifest (a metadata file describing the parts of an image, containing a list of constituent layers identified by digest).
  • Secure Signing and Verification Infrastructure: Data can be changed or copied while traveling over insecure channels (e.g. HTTP), so we need to ensure we are publishing and accessing content using secure protocols. The Notary project is Docker’s ongoing secure signing and verification infrastructure, which compares a checksum for a downloaded file with the checksum in Notary’s trusted collection for the file’s source (e.g. docker.com). For more details, see https://github.com/docker/notary
  • Dockerfile: Perhaps contrary to expectations, a Dockerfile is likely to produce different images over time, so as time goes on it’s hard to be sure what is in your images. To use Dockerfiles properly, you should:
    • Always specify a tag in the FROM instruction, and use a digest to pull exactly the same image each time
    • Provide version numbers when installing software from package managers. Since package dependencies can change over time, sometimes you need a tool (e.g. aptly) to take a snapshot of the repository
    • Verify any software or data downloaded from the internet using checksums or cryptographic signatures (a minimal sketch of this follows the list)
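For that last point, verifying a download against a published checksum takes only a few lines. A sketch; the URL and expected digest are placeholders:

import hashlib
import urllib.request

URL = "https://example.com/tool.tar.gz"   # placeholder download
EXPECTED_SHA256 = "<published checksum>"  # placeholder digest

# Download the file and compute its SHA-256 hash.
data = urllib.request.urlopen(URL).read()
actual = hashlib.sha256(data).hexdigest()

# Refuse to use the file if the hash doesn't match the published one.
if actual != EXPECTED_SHA256:
    raise SystemExit("checksum mismatch: refusing to use the download")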


This blog post is a quick look at the current security solutions for Docker containers; if you are interested, please refer to the reference articles below for more details. Are you using Docker in production? Have you implemented some of these security models?

References

[1] Analysis of Docker Security

[2] Docker Security – Using Containers Safely in Production

[3] Docker Doc – Docker Security

Why should you spend your time reading another blog? Well, if you’re reading this post, you’re probably part of, or want to be part of, the OSU developer community. So are we, and we want to grow our community. To that end, this blog is dedicated to providing useful information to developers on campus.

Developers at OSU come in all shapes and sizes: backend, frontend, Drupal, WordPress, framework developers, and people wanting to learn. They work with a variety of languages, frameworks, and tools. Given that variety, we plan to publish fresh blog posts at least once a week, with subjects varying by author.

These posts aren’t meant to exist in the ether; we hope they’ll spark conversations and help grow our skills while growing our community. In other words, comment and discuss (but keep it civil). And if there’s anything in particular you’d like to know about, or a post you’d like to write, let us know in the comments below.