How To Store Large Binary Files In Git Repos Using Git LFS

Jay Mishra| Git | 7 months, 1 week



 

Although Git has become the de facto standard for version controlling code, storing and managing large binary files in Git has long been a pain and a bottleneck. The reason is that Git gives every developer the full change history of every file, so each version of a large binary file is also stored locally, and repos quickly balloon in size. As a result, cloning a repo or fetching changes can take a very long time.

 

There are quite a few third-party tools that make it easier to manage large binary files in Git repos. Of these, Git Large File Storage, or Git LFS, really stands out and is quickly becoming a standard.

 

Git Large File Storage (LFS)

If you are familiar with the inner workings of Git, you will know that it stores file content in objects called blobs. Blob is short for Binary Large Object, and it is Git's way of storing the content of one version of a file. Blobs work fine for text files, where successive versions compress and delta well, but they are not very efficient for binary files: each changed version of a large binary tends to add close to its full size to the repo.
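To make the idea of a blob a little more concrete, here is a minimal sketch using Git's own plumbing commands. It assumes `git` is on your PATH and works in a throwaway directory; the variable names are only for illustration.

```shell
# Work in a throwaway repo so nothing real is touched.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"
cd "$demo_repo"

# Store some content as a blob; Git prints the object ID it was filed under.
blob_id=$(printf 'hello\n' | git hash-object -w --stdin)

# Ask Git what kind of object that is, and read the content back.
blob_type=$(git cat-file -t "$blob_id")
blob_content=$(git cat-file -p "$blob_id")
echo "$blob_id is a $blob_type containing: $blob_content"
```

Every version of every file you commit ends up as an object like this one, which is why many revisions of a large binary add up so quickly.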

 

So instead of writing large blobs for binary files into the Git repo, Git LFS writes only a small pointer file. The actual contents of the binary files are stored on a separate server from the repo server, using the Git LFS HTTP API. This Git LFS server can be backed by almost anything, such as AWS S3 or your own server somewhere. So let's say your repo has a video, a PSD file, or a large data set: Git LFS replaces each of them with a text pointer, while the actual file is stored on a different server. This keeps the size of the repo in check.
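A pointer file is just three short lines of text: a spec version, the SHA-256 of the real file, and its size in bytes. The sketch below rebuilds that text by hand to show how little ends up in the repo. `make_lfs_pointer` is a hypothetical helper, not a Git LFS command, and it assumes `sha256sum` is available (on a Mac you would likely need `shasum -a 256` instead).

```shell
# Hypothetical helper: print the pointer text Git LFS would keep in the repo
# in place of the given file.
make_lfs_pointer() {
    file="$1"
    # The SHA-256 of the content identifies the real file on the LFS server.
    oid=$(sha256sum "$file" | cut -d ' ' -f 1)
    # Size of the real file in bytes (tr strips padding some wc versions add).
    size=$(wc -c < "$file" | tr -d ' ')
    printf 'version https://git-lfs.github.com/spec/v1\n'
    printf 'oid sha256:%s\n' "$oid"
    printf 'size %s\n' "$size"
}
```

Running `make_lfs_pointer new.MP4` on a video of several hundred megabytes still prints only these three lines, which is roughly all that Git itself has to store and transfer for that version of the file.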

 

Git LFS is open source, licensed under the MIT license, and has ready-made binaries for Mac, FreeBSD, Linux, and Windows. It is written in Go.

 

So you can see that Git LFS has some great benefits:

  1. Keeping your Git repo size in check 
  2. Leveraging the same Git workflow you are already familiar with
  3. Easy way to version large files 
  4. Lightning-fast repo cloning and fetching
  5. No need to change permissions and access controls on repos. 

 

How to get started with Git LFS:

Getting started with Git LFS is super simple.

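1. Download and install the Git LFS command line extension, then set it up for your user account (you only need to do this once):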

git lfs install  

2. Select the file types you want Git LFS to manage:

git lfs track "*.MP4"

Make sure .gitattributes is tracked:

git add .gitattributes

3. Just go ahead and follow your regular Git workflow:

git add new.MP4
git commit -m "New Video File"
git push origin master
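Once the steps above are done, it can be reassuring to confirm that Git is really routing the file through LFS. The sketch below is meant to be run inside the repo. `is_lfs_tracked` and `is_lfs_pointer` are hypothetical helper names, not Git LFS commands. The first relies on the rule that `git lfs track "*.MP4"` records in .gitattributes (`*.MP4 filter=lfs diff=lfs merge=lfs -text`), the second on the fact that a pointer file starts with a `version https://git-lfs.github.com/spec/v1` line.

```shell
# Hypothetical helper: succeed if .gitattributes sends this path through the LFS filter.
is_lfs_tracked() {
    # `git check-attr` prints "<path>: filter: <value>"; keep only the value.
    [ "$(git check-attr filter -- "$1" | sed 's/^.*: filter: //')" = "lfs" ]
}

# Hypothetical helper: read file content on stdin and succeed if it is an LFS pointer.
is_lfs_pointer() {
    head -n 1 | grep -q '^version https://git-lfs\.github\.com/spec/v1$'
}
```

For example, `is_lfs_tracked new.MP4 && echo "tracked by LFS"` checks the tracking rule, and `git cat-file -p HEAD:new.MP4 | is_lfs_pointer && echo "stored as a pointer"` checks what was actually committed. Keep in mind that on a case-sensitive setup the pattern `*.MP4` would not cover a file named `clip.mp4`. Git LFS also has its own `git lfs ls-files` command, which lists the files it manages.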

 

 


