The following pages outline what EFS is, how it works, and how to use it from the perspective of an end user, a developer, a systems administrator, and an integrator of the EFS software.
The documentation is divided into chapters specific to tasks within EFS:
This doc outlines what EFS is, its reason for being, some basic concepts and how to get moving with EFS. We assume a basic familiarity with Unix/Linux systems and NFS.
EFS, or Enterprise File System, is a suite of Perl-based commands that help deploy, distribute and maintain versioned software releases in the UNIX environment. These commands structure your file system for ease of deployment and, as a byproduct, standardize your environment. They even abstract the OS from the end user so that file locations are common across different UNIX flavors.
In addition to the Perl layer, EFS infrastructure also consists of an NFS namespace and a database system (Oracle or MySQL are the two supported flavors as of the time of this writing).
The following scenario is an extremely common one. The organization you’re in has hundreds of Unix servers with dozens of use cases, ranging from C++ development and compile hosts, to Web server farms which use open source products like Apache and Perl, to production hosts that run in-house production software, and the list goes on. So, as a systems administrator, or someone responsible for the systems in some capacity, you must:
Ensure that these hosts have the software they need. Do we have the latest version of Apache on all our hosts? Which hosts have which versions? Wait, does the newly rebuilt host XYZ have all the Perl modules the developers are going to need? Let’s just run half a dozen scripts we’ll write up on the fly to check that out…
Install the versions on all the hosts. If it’s Perl-based, you may have to run repeated CPAN install commands on all the hosts. If it’s C++ or C libraries, you’ll need to check that the symlinks are up to date. A few open source packages? Did you make sure you had all the dependencies worked out? No problem, you say, I’ll just run RPM commands a few times to get all that stuff worked out, and pray that the systems all remain in sync…
… and if it’s another platform altogether, like Solaris, you’d better adapt your automation to account for those particular quirks.
You soon discover that automation in this fashion is a big time sink, it’s difficult, and you wonder why it hasn’t been standardized, since every systems admin in the world runs into this problem.
What if you could just stick all your software on an NFS mount, export that, and run it from there? And then if a new host comes along, there’s no need to spend countless hours logging tickets, checking hosts, writing scripts, and running commands to get your new host up to date. Now what if that NFS mount had a database stuck to it, so updates to that NFS mount could be tracked and controlled… hm…
If you’re the typical application developer, you’ve probably come across one or two familiar scenarios outlined below.
You need to test before going live, but you want to make sure that the system you’re testing on matches production as closely as possible. So you request a particular OS version, a particular patch set for that OS, and then you need to ensure that the particular libraries you depend on are all there, hoping that they’re the same versions you’ve compiled against and will be going live with.
But that’s difficult, because the release management controls in place in your organization don’t lend themselves to easy auditing. You’re not sure how much faith you can have in what’s currently installed, or which libraries or packages might be missing, so you might be tempted to statically compile in those external libraries or just sit and bite your knuckles.
Now your own releases suffer the same deployment uncertainties as the libraries and software mentioned in the previous section: it’s hard to standardize, deploy and roll back in case of a deployment problem.
EFS solves these problems by providing you with a centrally-administered namespace for you and your colleagues to absolutely guarantee that particular versions of releases will be available to:
It is simple to use EFS. In order to get started, you need a Linux server and some storage space. If you want any redundancy, you need 2 Linux servers.
We are in the process of developing a process for installation that will take into account some of the rougher spots of getting EFS working on your system. For now, install the software as you would any Open Source software. Check back here for updates and tools that may help in this process.
We will be constantly updating this page and the installation process as we get users to try the system. Please feel free to send mail through the mailing list and someone will get back to you to address specific issues.
An “EFS Cell” is the building block of the global EFS environment. It is simply the NAS filer (or filers) and the NFS clients that mount the /efs file systems from them, as a single administrative subdomain of the global environment. This cell can have the celltype of either DEV (for development and QA/UAT testing) or PROD (production). There is at most one cell per distinct data center per celltype. A single cell, though, can support multiple data centers.
EFS has 2 main namespaces: “/efs/dev” (read/write) for development, and “/efs/dist” (read only) for distributed code. The DEV celltype supports both /efs/dev and /efs/dist (for UAT and QA testing). The PROD celltype supports only /efs/dist.
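The relationship between celltype and namespace can be sketched as a small shell function. This is only an illustration of the rule stated above; the function name efs_namespaces is our own invention and is not part of the EFS CLI.

```shell
# Print the EFS namespaces available in a cell of the given celltype.
# efs_namespaces is a hypothetical helper, not an EFS command.
efs_namespaces() {
    case "$1" in
        DEV)  printf '%s\n' /efs/dev /efs/dist ;;   # development plus UAT/QA testing
        PROD) printf '%s\n' /efs/dist ;;            # distributed code only
        *)    echo "unknown celltype: $1" >&2; return 1 ;;
    esac
}

efs_namespaces DEV
```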
A NAS filer containing the EFS data is called an EFS “Cell.” This cell consists of every part of EFS’s infrastructure in a single location, including storage, EFS management servers and caches. The collection of cells within a region is referred to as a “Regional Cell.” When a UNIX server is connected to EFS, it accesses the EFS Namespace.
Each regional cell may contain the EFS DEV or EFS PROD environments or both.
The DEV environment consists of a filer volume where the source code of the applications and libraries is “housed.” This code is not duplicated to other regional cells or other filers. Because of this, the DEV environment is backed up on a regular basis.
To connect a client to EFS, a simple boot script is installed on the client machine. This script makes the static NFS mount points necessary to make the /efs/dist, and optionally /efs/dev, file systems available on the client.
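To give a feel for what such a boot script does, here is a minimal dry-run sketch that only prints the mount commands instead of running them. The filer host name, the export paths and the function name are all placeholders we made up for illustration; the real EFS boot script may look quite different.

```shell
# Placeholder filer host; substitute the NAS filer that serves your cell.
FILER="filer.example.com"

# Print (do not run) the static NFS mounts a client would need.
# Pass "with-dev" to include the optional read/write /efs/dev namespace.
efs_mount_commands() {
    echo "mount -t nfs -o ro ${FILER}:/efs/dist /efs/dist"
    if [ "${1:-}" = "with-dev" ]; then
        echo "mount -t nfs -o rw ${FILER}:/efs/dev /efs/dev"
    fi
}

efs_mount_commands with-dev
```

Printing the commands first makes it easy to review them before running anything as root.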
EFS is built using NAS filers and the NFS protocol. The original Morgan Stanley project was based on AFS, but Merrill Lynch was uninterested in this technology. The NAS filers were originally abstracted from clients by using Acopia file switches, and eventually by using Linux automounter technology for the DEV environment and NetApp’s FlexCache technology for PROD.
There are two components to EFS: (1) the File System and (2) the Management Utility. The File System is the EFS namespace - NFS mounted on EFS clients. The Management Utility is the EFS Command Line Utility (CLI) for creating and managing projects in EFS.
EFS platform support includes 32- and 64-bit Solaris 8, 9 and 10, and 32- and 64-bit Linux RHEL 2.4, 2.6 and 5. /efs is a mount point for a platform-specific directory structure that remaps the “exec” names to specific platforms. This indirection means you can build a single release for as many supported platforms as necessary, and provide a single, platform-neutral path to access the platform-specific components. All content in EFS is managed in a three-level hierarchy: Metaproj[ect], project and release.
The release space in /efs/dev has 3 directories: src, build and install. The src directory contains the source code for the project, build is a general work area available for the build cycle, and install is the target for the binaries.
/efs/dev/$meta/$proj/$release/
    src/
    build/
    install/
        common/
        .exec/
        exec ->
The release space in /efs/dist is a copy of only the contents of the “install” directory from /efs/dev:
/efs/dist/$metaproj/$project/$release/
    common/
    .exec/
    exec ->
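Since /efs/dist holds only the contents of the install directory, a file under install in /efs/dev ends up one level higher in /efs/dist. The sketch below shows that mapping. The function name and the metaproj, project and release names in the example are hypothetical.

```shell
# Map a path under a release's install directory in /efs/dev to the
# location it would have in /efs/dist. efs_dev_to_dist is a hypothetical helper.
efs_dev_to_dist() {
    case "$1" in
        /efs/dev/*) ;;
        *) echo "not under /efs/dev: $1" >&2; return 1 ;;
    esac
    rest=${1#/efs/dev/}                   # metaproj/project/release/...
    meta=${rest%%/*};    rest=${rest#*/}
    proj=${rest%%/*};    rest=${rest#*/}
    release=${rest%%/*}; rest=${rest#*/}
    case "$rest" in
        install/*) echo "/efs/dist/$meta/$proj/$release/${rest#install/}" ;;
        *) echo "not under the release install directory: $1" >&2; return 1 ;;
    esac
}

efs_dev_to_dist /efs/dev/mymeta/myproj/1.0/install/common/bin/tool
```

Paths under src or build are rejected, because those directories are never copied to /efs/dist.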
Moving from the DEV environment to the PROD environment is managed using the EFS stage model. There are five basic stages (dev, qa, stable, uat and prod). All releases begin life in the dev stage. The dev, qa and stable stages can only be distributed to the DEV environment, and the uat and prod stages are the only ones which can be distributed to the PROD environment. In order to qualify for the uat or prod stage, a release must (a) be locked down, (b) have dependencies which are all at least uat or prod, and (c) pass all of the required EFS audits, of which there are several.
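The stage rules above can be written down as two small checks. This is a sketch of the logic only, under the assumption that lock state and audit results are already known as yes/no values; the function names are ours, and the real checks live inside the EFS Perl library.

```shell
# Succeeds if a release in the given stage may be distributed to PROD.
stage_allows_prod() {
    case "$1" in
        uat|prod) return 0 ;;
        *)        return 1 ;;    # dev, qa and stable stay in the DEV environment
    esac
}

# Succeeds if a release meets the three conditions for the uat or prod stage.
# Usage: release_qualifies <locked: yes|no> <audits passed: yes|no> [dependency stage...]
release_qualifies() {
    [ "$1" = yes ] || return 1             # (a) locked down
    [ "$2" = yes ] || return 1             # (c) all required audits pass
    shift 2
    for dep_stage in "$@"; do              # (b) every dependency is uat or prod
        stage_allows_prod "$dep_stage" || return 1
    done
    return 0
}

if release_qualifies yes yes uat prod; then
    echo "release may move to uat or prod"
fi
```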
EFS is managed from a core Perl library that contains most of the application logic. The library is accessed via a simple command line interface (CLI), through a client/server architecture which centralizes all privileged EFS commands on a limited set of administrative servers. The basic structure of an EFS command is as follows: “efs action object [arguments]”
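To make the “efs action object [arguments]” shape concrete, the sketch below splits a command line into those three parts. It does not call EFS at all. The words “create” and “project” and the argument names are made-up examples, not a statement of which actions or objects the real CLI accepts.

```shell
# Split an EFS-style command line into action, object and arguments.
# describe_efs_command is a hypothetical helper for illustration only.
describe_efs_command() {
    if [ "$#" -lt 2 ]; then
        echo "usage: efs action object [arguments]" >&2
        return 1
    fi
    action=$1
    object=$2
    shift 2
    echo "action=$action object=$object arguments=$*"
}

describe_efs_command create project mymeta myproj
```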
Developers and EFS Operations each run different commands to complete tasks in the EFS environment.
We are excited about people using our software and we welcome contributions from our users.
Anyone wishing to contribute code or documentation to the EFS repository will need to:
To be able to contribute code, you must do the following:
Submit the signed document:
When your Contributor Agreement has been received by the EFS team, a team member will notify you whether it has been accepted or rejected due to incorrect data entry (like missing a physical signature), and what the next steps are.
See the child links below to see instructions on actually contributing your code once you have been accepted as a contributor.
Once your Contributor Agreement has been accepted, you can then submit code:
To see the process we use to accept contributions please see our Contribution process.
Thank you for your contribution!
We also have mailing lists where you can share your experience with EFS and discuss issues or questions you may have. Also, check the child links below for more information and other contribution options.
Attachment: EFS Contributor Agreement.doc (35.5 KB)
Code contributions are made as patches attached to tickets. After a ticket with an attached contribution is made, it is reviewed by the EFS project team. Public discussion about contributions is made either on the efs-dev mailing list, or as comments within the ticket. Status changes, like acceptance and rejection, are always made in the ticket system.
Contributors who have provided patches which are included in a release are mentioned in the AUTHORS file inside the release.
If you encounter an error while working with EFS, and don’t understand what is causing it, then create a bug report.
However, if you do know how to fix the problem you encountered, then think about submitting a patch, or getting commit privileges (see below).
Try to keep your patches specific to a single change, and ensure that your change does not break any tests. Do this by running ‘make test’.
If there is no test for the fixed bug, please try to provide one.
The preferred method of creating a patch is to use the current head of master in the EFS git repository. This ensures that the patch works with the most recent EFS source code, and makes it easier for the EFS team to apply the patch.
$ git clone email@example.com:efs-core.git
$ cd efs-core
# ...modify code...
$ git commit -a
$ git format-patch origin/master # generates a patch file (e.g. 0001-foo.patch)
Each and every patch is an important contribution to EFS and it’s important that these efforts are recognized. To that end, the AUTHORS file contains an informal list of contributors and the contributions they have made to EFS. Patch submitters are encouraged to include a new or updated entry for themselves in AUTHORS as part of their patch. The format for entries in AUTHORS is defined at the top of the file.
If you have a new feature to add to EFS, such as a new test:
cd efs-core
diff -u /dev/null newfile.t > newfile.patch
Trac creates a ticket for the submission, and you will receive an automatic reply with details of the ticket identifier. This identifier should be used in all further correspondence concerning the submission.
Everyone on the list sees the submission, and can comment on it. An EFS project team member will make sure you have submitted your EFS Contributor Agreement before the patch or feature is accepted. A developer with git commit authority will commit it to git once it is clear that it is the right thing to do.
Even developers with git commit authority stick to this scheme for larger or more complex changes, to allow time for peer review.
A list of all unresolved tickets is available.
You may wish to apply a patch submitted by someone else before the patch is incorporated into the master branch on the git origin.
For single diff patches or svn patches, copy the patch file to efs, and run:
cd efs
patch -p0 < some.patch
For recursive diff patches, copy the patch file to workingdir, and run:
cd workingdir
patch -p0 < some.patch
To be on the safe side, run ‘make test’ before actually committing the changes.
Sometimes new files will be created in the configuration and build process of EFS. These files should not show up when checking the distribution with git status.
The list of these ignored files can be set in .gitignore:
cd <efs-core root directory>
vim .gitignore # modify list of ignored files, and save...
Once git status is ignoring the correct files, submit a patch as described above.
The http://www.openefs.org website is hosted in a Drupal CMS. Submit changes through the usual ticket interface in Trac.
As we have stated, EFS is an infrastructure. A lot is in the pipeline. We are planning to offer already-configured software for the environment, starting with Perl. More will be on the way as we develop this site and the Open Source version of the software.
We have divided the FAQs by the type of work you are doing. Please see the child pages below.
What basic configuration do I need to run EFS?
You need a Linux server with some storage. If you want some disaster recovery, you will need at least 2 Linux servers.
What benefits will I see if I use EFS?
There are 2 main benefits: ease of deployment and reuse. EFS is designed to make deploying your software quick and easy across an enterprise or global structure. This is its main benefit. To make this work, it standardizes the environment. This standardization has a secondary benefit of making reuse easy. Since the code is always in the same place on every machine and is globally available, it makes reuse of even small utilities easy.
Should I even consider using EFS on a stand-alone machine? Are there any benefits?
At this point, no. This is enterprise-level software. Unless you are trying to address deployment issues, it is not the software for you. The current ALPHA version is really for those familiar with AFS and who already know the software from working within the corporate environment where it is already installed.
Why should I use EFS for my development purposes?
What platforms does EFS support?
Currently, EFS is designed to support Linux v. xxx and Solaris on Sparc v. xxx. It is possible that EFS could be modified to work with other UNIX flavors, such as HP, Solaris on Intel, but this has not been done so far.
Does EFS work with Windows?
Because of the structure of Windows, and the basic assumption that all applications are installed locally, EFS does not fit well into the Windows paradigm. However, it is possible to set up a Samba mount and access the read-only information from the PROD environment. This has not currently been tested, nor is this an endorsement.
Does an application need to be specifically designed for EFS?
Not specifically. An application does not need to be newly developed for EFS, but you need to take certain aspects into account. First, the target EFS environment is read-only. That means that any read/write space needs to be taken care of outside of EFS. This read-only attribute lets all users connecting to an EFS Cell look at the same binary copy.
Second, the software cannot assume certain paths, since they will probably not be available. Also, the area where the software installs (/efs/dev/) is not the same as the location where the software will be run (/efs/dist/). These all need to be taken into consideration as software moves into the EFS space.
Does EFS scale? How large an installation can EFS support?
Yes! This is the point of the software. We have used the software in a corporate environment with hubs (or “EFS Cells”) in New York, London, Hong Kong, Singapore, Korea and soon other Asia/Pacific Rim locations.
Should I use EFS or Subversion?
Both. EFS is not a source-control system, it’s a deployment-control system. Maintain your code in Subversion and deploy your code with EFS.
What license does EFS use?
EFS uses the Apache 2.0 license (http://git.openefs.org/efs-core/tree/source/LICENSE).
Our Get Involved page in the Developers’ Zone gives you information on how to contribute to the EFS code base.
Can I write my application data and logs into EFS?
No, EFS is not designed as a general purpose file system. It’s used to store and distribute read-only binaries, libraries, configuration files and other data that is not changing.
For development using GCC from EFS, do I need locally installed libraries/headers?
When using GCC from EFS, please make sure that the machine you’re building on contains a locally installed copy of the glibc and glibc-devel headers. These packages will be called glibc-x.y.z and glibc-devel-x.y.z as well as glibc-common-x.y.z and glibc-headers-x.y.z. On one of our compile servers for example, the following packages are installed. Your versions may vary, depending on the version of the distribution you’re using (RHEL4 vs. RHEL5 for example).
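One way to check a build host for the four package names mentioned above is sketched below. The helper reads a list of installed package names on standard input and prints any that are missing. The function name is our own, and feeding it from rpm assumes an RPM-based distribution such as RHEL.

```shell
# Read installed package names on stdin (one per line) and print each of
# the four glibc packages that is not in the list.
# missing_glibc_packages is a hypothetical helper, not part of EFS.
missing_glibc_packages() {
    installed=$(cat)
    for pkg in glibc glibc-devel glibc-common glibc-headers; do
        printf '%s\n' "$installed" | grep -qx "$pkg" || echo "$pkg"
    done
}

# On an RPM-based build host:
#   rpm -qa --qf '%{NAME}\n' | missing_glibc_packages
```

No output means all four packages are present locally.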
Can I dist only to a specific cell?
Yes. Use the -cells argument to specify the cells you want to distribute to.