Source Control Software Is Too Intrusive

Yes, I said it.  Read the title again.  Source Code Repositories Are Too Intrusive.  I am speaking in generalizations, but before you leave an angry comment on this post, hear me out.

Regardless of whether you’re using Git, Mercurial, TFS, or some other technology to store your source code files, your solution is intruding on your world.  It may or may not be built into your toolset, and you likely have to be ultra-conscious of what is and is not checked in.

Don’t get me wrong.  I think that being able to branch a project and merge it back later is a phenomenal feature.  I think that changesets are valuable.  What I’m suggesting is that on a daily basis, I am very AWARE that I’m using source control.  In my opinion (and this entire article is opinion), great software is invisible.

An example of invisible software is Dropbox.  (I am not endorsing Dropbox for source control.)  I never think about it.  I never have to open the Dropbox client.  Yet, my files are persisted to a website where I can get them at any time.  They’re synced to each of my machines.  Never once do I have to “commit” them to one location or another.

Why isn’t our source control software invisible?  I don’t want to think about my source control software until I need it.  A hard drive fails.  I accidentally delete some files.  I absolutely want to be able to get previous versions of those files back.

Why doesn’t source control “check in” my code every time I save instead?  I’m not talking about “committing” to the original branch, mind you.  We’d be breaking the build every 7 seconds.  I think that committing my code back to the shared source repository that my entire team is using is a monumental decision, and should not be taken lightly.  But for my personal “bookshelf” or whatever term you use, I truly believe that each press of the Save button should also be preserved.

We get pulled away from our desks all of the time.  Meetings, lunch, meetings, etc.  If something happens when we’re away, we’ve lost our work.  I don’t know about you, but if I were to lose even an hour’s worth of code, my productivity for the day would be shot.  I have a trigger finger for the “Ctrl + S” command, and I think that our invisible software should recognize that, and persist it somewhere other than my local machine.

What do you think?  Do you like having to remember to check-in every time you’re done with a file?  Would persisting every “save” be beneficial or detrimental?  My vote is for beneficial.

48 thoughts on “Source Control Software Is Too Intrusive”

  1. It is a shame that most commenters (all bar two) did not actually properly read or fully understand what the OP was saying.

    Al did understand, and suggested a mod to Git to stage a local repository that would trigger periodically, irrespective of remote commits, which would as usual be user driven.

    I agree with the OP – it would be nice to essentially have a branch on the remote that would automatically be created as soon as I “saved” locally, with “commit” then being a final save to branch followed by a merge. That way, I am achieving several objectives:
    – I have an offline copy of my work.
    – I can easily roll back even in my un-checked in code, using well understood source control techniques.
    – I can make my own mini-stages without disrupting others’ code base.

    Now there will be those who argue that all this can be done simply by creating a branch. To which I respond “why should I”? As in why should I have to invoke some special commands to do what in fact should always be done by default?

    Along these lines, I would propose that such an “auto-branch check-in” also create named snapshots of each project whenever the project in question a) has changes and b) compiles successfully.

    The check-in comments (which would be auto-prompted for at these named snapshot stages) could be accumulated to become the check-in comments for the eventual (automatic / invisible) branch merge.

  2. I think you are mixing two pretty different things here: (team) source control and (frequent, incremental) backup. For the first part, any modern source control SW is good. For the second one, if you are using Windows, I would recommend shadow copy (built-in feature) to another machine, it can do exactly what you want: save every change and be able to give you a diff / old version if you need one. If you’re using a different OS, I agree with those above who suggested a daemon/regularly running script to copy to a remote location.

    • @Szaki, I will agree with you on this, that source control (team) and backup (frequent, incremental) are very different, yet conceptually similar, subdomains. They should be developed and used separately. Maybe your IDE can do it. Or your OS.
      But turning source control software into a monolithic-do-all is not the right way to do this.
      The best thing about the GNU toolkit is that it has a wide variety of different tools that help you do a complex job in a variety of ways. The fact that all this functionality is implemented in several different tools does not mean it is less usable, but more usable.
      If you had a specific task that needs to integrate all the tools together, well, make a wrapper, GUI or something integrating all the tools.
      So in conclusion, a new software, implementing this snapshot-ing can be done wrapping around source control or not. I prefer not wrapping around source control.

  3. I agree with the sentiments here. However, I see a hierarchy of commit levels:

    (i) on Save, commit to a user repository associated with the task at hand. Changes here are not visible to other users, but are versioned and retrievable.
    (ii) on completion of task at hand, promote to a higher-level repository.
    (iii) changes made in (ii) may be committed to a higher-level repository, for example integrating changes from multiple teams into a new product release.

    For some large products (e.g. Windows, Linux, …), more than three levels may be necessary, as successively larger integrations occur.

    For (i) it would be handy if this occurred automatically, and transparently. For (ii) and (iii) it would be necessarily non-transparent, as the steps affect other developers. For some commits, correct authorisation may be required – e.g. having passed code reviews and/or acceptance testing.

  4. This feels like another case of accidental versus essential complexity. There is some essential complexity to managing a codebase that no amount of complaining is gonna make go away. But admittedly, there might be more accidental complexity to SCM than there needs to be.

    Identifying the motivation behind a set of changes is an essential complexity which is difficult to make non-intrusive. But the path to making a set of changes could be made simpler.

    IDE-SCM integration has a lot of room for improvement. The IDE has quite a bit of context about what you’re trying to do. For example, the IDE could commit after it looks like a refactor:rename operation went well. Perhaps it could roll back a change when you undo it.

  5. From a Stack Overflow question: why not use (on Git) https://github.com/bartman/git-wip which creates dangling branches of your wip, the latest one having a proper wip/name head.

    This means that you would have both a private ctrl+s auto-SCM, and the regular public “commit” style SCM in the same repo, and the regular garbage collection would neatly remove the old dangling ‘wip’ branches that aren’t linked to any public view of your current development.

  6. I never like auto-save for source code; for Google Docs maybe, but not for files under source control.

    In that regard, the tool we are using is AccuRev, which is stream-based: you can “keep” your private changes, and these changes only affect you and not everybody else.

    Once you are ready to share your changes with the bigger group, you “promote” your changes. As the changes are promoted to “higher” streams, those changes become visible to a wider audience.

    In our set-up, our promotion goes from workspace->project->dev->build->qa->beta->release. It’s very nice since we customized our stream structure to mimic our development cycle.

    Note: I am only a user of AccuRev and in no way affiliated with or compensated by them.

  7. Where I work we use ClearCase, and while it is not an open/free SCM, it has many good ideas in it. Take the main subject here, tracking your own changes. When I work on a “project” (in ClearCase terms, not the regular sense), there is another step after checkout and checkin called deliver, which is very similar to the push action in Git. This allows me to do as many checkins and checkouts as I like on my own view (which is more like a checkout of a repository in CVS than a clone in Git), so I can track my own changes, as can anyone who cares enough to look. But if I look at the source tree, I can limit it to seeing only the main branch and the changes on it.

    However, in general I don’t think that checking in every time I leave the computer is correct. Let’s assume the checkin is kept somewhere outside the machine and a developer does about 50 saves in an hour (mostly automatic if you are using a good IDE): you get a very large overhead of data sent back and forth to the SCM server.

  8. I first read the title and I thought you were FN nuts. But you have a great idea.

    So long as the die-hards realize he is not talking about committing to the trunk but to each one’s own personal workspace. And the Dropbox analogy rules.

  9. In my experience, there is a reason that you have to be so conscious of what is checked in when. The main issue is that you never know who is going to “check out” or update their code base at any given moment. If you check in code that doesn’t compile or for whatever reason isn’t exactly completed yet, then someone else can potentially check that code out and then be stuck with a broken code base.

    The idea behind SCM is that it is a valid codebase, not an incremental backup of your own work in progress. Yes, the SCM will hold multiple versions of the same codebase, but it shouldn’t hold fragments, half-baked ideas, etc. You can use Dropbox (or whatever you want, really) for your own backups and such, but the code in the SCM system should always compile and function as expected.

  10. I agree with the post, as I see many benefits. I do not agree that if this were a requested/popular feature, SCMs would already be implementing it; it all comes down to how each SCM was designed for the problem it was solving. But as more and more work is done beyond source files (documents, images, Photoshop, etc.), and with the DVCSs out there, there are more potential solutions for a different problem that traditional SCMs were not designed for.

    For instance, one of the projects I have been working on uses Git and TFS together, where a commit in Git is a commit to TFS, but a local Git repository, which syncs its changes to a cloud service, checks in a change for each of my saves. On top of this, Apple’s OS X Lion has this idea implemented in its applications, and goes one step further with auto save: you no longer get your Ctrl+S for save; it saves for you, and then you can use its interface to view past changes.

    I am not preaching we should all use Macs, but I am saying that the idea is not new and one potential solution is out there for Apple users. Me being a PC user, I have to write my shell plug-ins and monitor changes for a file and then provide a nice way in Explorer and other applications to say “Commit these changes”.

  11. For Visual Studio users (at least through 2008), there’s the Visual Local History plug-in (http://vlh2005.codeplex.com/). It makes a new backup to a directory every time a file is saved. The directory, number of backups to keep and diff program are all configurable. The menu option it adds to VS’s Tools menu allows you to compare the current version to any previously saved version.

    Once installed, it’s invisible until needed.
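
For readers who want to see the general idea, here is a minimal Python sketch of "copy the file aside on every save and keep only the newest few copies". It is not the plug-in's code: the function name `backup_on_save`, the `<file>.<n>` naming scheme, and the `keep` parameter are all assumptions made up for illustration.

```python
import shutil
from pathlib import Path


def backup_on_save(source, history_dir, keep=5):
    """Copy source into history_dir as "<name>.<n>" and keep only the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source)
    history_dir = Path(history_dir)
    history_dir.mkdir(parents=True, exist_ok=True)
    prefix = source.name + "."
    # Existing backups of this file are named "<name>.<n>"; collect the numbers.
    numbers = sorted(
        int(p.name[len(prefix):])
        for p in history_dir.iterdir()
        if p.name.startswith(prefix) and p.name[len(prefix):].isdecimal()
    )
    next_number = numbers[-1] + 1 if numbers else 1
    target = history_dir / f"{prefix}{next_number}"
    shutil.copy2(source, target)
    numbers.append(next_number)
    # Drop everything except the newest `keep` backups.
    for old in numbers[:-keep]:
        (history_dir / f"{prefix}{old}").unlink()
    return target
```

An editor hook or file watcher would call `backup_on_save` after each save. Comparing two versions is then a matter of pointing any diff tool at two of the numbered copies.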

  12. I agree with most people. You confused source control with backup.

  13. Jeff,
    A little under a year ago I set about solving a similar problem for myself regarding PowerShell scripts I write. I write scripts on at least 3 different machines. The solution I developed was a PowerShell module that saves files locally but also pushes files to a hosted personal SharePoint site I have online. Files go into a versioned document library. This way I have all of my machines push to that repository. This won’t work for everyone, but I found it quite useful and started a CodePlex project (which sadly has been stale for months) to share it out.

  14. “A hard drive fails. I accidentally delete some files.”

    Those comments certainly seem to suggest an inclination to think of source control as backup.

    Source control and backup are two different things.

    You should have both. They should be different.

    And of course you should be backing up source control as well.

    “Do you like having to remember to check-in every time you’re done with a file?”

    No. Because I don’t check in when I am done with a file.
    I check in when I am done with a bug. Or when I am done with an enhancement request. Or when I have completed a release or at least partial release (sufficient enough to be cohesive by itself.)

    And those processes each require other actions. Such as closing the bug, writing a summary (in the bug) about what changed, etc.

  15. You should be able to create a personal branch of your common development branch in all source control systems. And then have a script that commits changes to your personal branch running on file changes. This will save your work, and let others see what you do. But it will not decide for you which changes should go into the common branch, or when you should merge.

  16. A save offsite with a separate checkin would be great, provided the local copy is saved first, and the offsite save is done in an asynchronous way. I don’t want to wait from the time I hit save until the offsite save is done. Gross productivity then networking. 😉
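
The "local first, offsite asynchronously" ordering described here can be sketched in Python with a queue and a background thread. This is only an illustration under stated assumptions: the class name `AsyncMirror` is hypothetical, and a plain directory stands in for the offsite store, where a real tool would upload over the network.

```python
import queue
import shutil
import threading
from pathlib import Path


class AsyncMirror:
    """Write files locally right away and copy them to a mirror directory in the background."""

    def __init__(self, mirror_dir):
        # Assumption: mirror_dir stands in for the offsite location.
        self.mirror_dir = Path(mirror_dir)
        self.mirror_dir.mkdir(parents=True, exist_ok=True)
        self._pending = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def save(self, path, text):
        """Write the local file, queue the offsite copy, and return without waiting for it."""
        path = Path(path)
        path.write_text(text)
        self._pending.put(path)

    def _worker(self):
        while True:
            path = self._pending.get()
            try:
                # Only the file name is kept; directory structure is flattened in this sketch.
                shutil.copy2(path, self.mirror_dir / path.name)
            except OSError:
                pass  # a real tool would log and retry; this sketch just drops the copy
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued copy has been attempted."""
        self._pending.join()
```

The editor only ever waits for `write_text`; the slow copy happens on the worker thread, and `flush` exists so a caller can wait for pending copies before shutting down.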

  17. Ensuring that your data on disk is safe is not an SCM concern. You have tools like Dropbox, shared network drives, etc.

    Once this problem is solved, you may notice that your IDE (mine is Eclipse) makes an internal version for every save you make. You can go back to any state and compare internal changes. This is fast and convenient, and because you don’t touch the SCM, you don’t have to roll back experiments. (I use the revert function a lot to go back to a blank state.) I always ensure that I only commit relevant changes for each file.

    I think what you require will not really save time, not for me at least, what you save on one side, you lose on another (like removing changes you made but don’t want finally).

  18. I can see where you are coming from with this argument, but I have to disagree. I think the reason that you feel this need (and I don’t) is that our processes differ.

    For instance, I use source control for many things, but, in this specific instance, as a waystation / checkpoint of sorts. Let’s say that I am working on a feature. I take a TDD-like approach though I am never capable of making it to full TDD. Regardless, I write a bit of code, test and make sure it works. Then, I check in saying “Added feature x w/ y” or whatevs. Every time I complete a task – that is a check in, which usually hits ~ every 1 to 2 hours.

    By following this pattern, my source control is always stable and 100% up to date. I never lose work, but – if I did – it was only an hour or so anyway. If I am working at home (say on the weekend), I don’t want to have to figure out where I was at when I left off on a half-implemented solution.

    Finally, I end up w/ a very well documented process of changes that I made over time. When I am trying to sort through a bug, I can literally peel back every change until I find the source. Then I can use that same change log to verify every down-stream feature (in addition to whatever tests I may have in place).

    The Dropbox paradigm is a nice concept for rapidly changing and inherently stable items, such as Word documents. However, my “on the fly” coding is too fragile / haphazard for me to trust an automated check-in system. I would much rather be purposeful in it.

  19. TFS has an option to “Check in everything when closing a solution or project.” Problem solved.

    • Yeah, problem solved…right.

    • TFS also has the option, which can’t be unchecked, of not working with anything outside of Windows.

      • Given the overly buggy state of the current SvnBridge release as of end-2012 and, worse, the long-standing unwillingness of this now Microsoft-walled project to service several pending cases of active user/developer corrections input, I’d have to agree. I haven’t even bothered to check on usability state of git-tfs (Mono-based) or the Yet Another New Child On The Block, git-tf (Java-based, surprisingly), so far. Conclusion: I’d recommend staying away from TFS, since one will experience issues trying to do suitable development (cross-platform etc.) as easily/powerfully as with widely used solutions (not to mention the PITA upstream project merge issues with TFS as opposed to tree integration of authentic third-party upstream sub projects in e.g. svn / git).

  20. Why do we still have the concept of “saving” source code files?

    I have been using RubyMine for the past 6 months and I LOVE how there is no concept of saving files. You make a change in the editor and BAM it’s magically saved without any extra clicks or button presses on my part.

    I suppose if you wanted to make your SCM repository transparent you could have a job kick off every x minutes to commit any changes. BUT the reason I don’t like that is because when I commit I am communicating intent to myself and the rest of my team. I can look back through my SCM log and get a story of what work was being performed and why.

    However, you could have the automatic check ins go to a different branch, and then when you are ready to make an explicit commit you could merge from that temporary location.

    Dunno… guess it all depends on how far you want to take it. I feel like my current workflow using Mercurial is pretty good. Automatic commits sounds like it might be pushing me into the point of diminishing marginal returns.

  21. If you’re writing a big project with multiple developers, or you need to make absolutely sure that what you’re writing is clean and available to a community, you need an “intrusive SCM”.

    But for what it sounds like you want, a nightly backup snapshot would probably suffice. Write a simple script that gets run nightly and zips up a snapshot of your project. You wouldn’t have to think about it either.
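
A snapshot script of this kind can be very small. The following Python sketch zips a project directory under a timestamped name; the function name `zip_snapshot` and the archive naming scheme are assumptions for illustration, and scheduling it nightly is left to cron or Task Scheduler.

```python
import shutil
from datetime import datetime
from pathlib import Path


def zip_snapshot(project_dir, backup_dir, now=None):
    """Zip project_dir into backup_dir under a timestamped name and return the archive path."""
    now = now or datetime.now()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Archive base name looks like "<project>-YYYYMMDD-HHMMSS".
    base = backup_dir / f"{Path(project_dir).resolve().name}-{now:%Y%m%d-%H%M%S}"
    # make_archive appends ".zip" and returns the full path of the archive it wrote.
    return Path(shutil.make_archive(str(base), "zip", root_dir=project_dir))
```

Keep `backup_dir` outside `project_dir`, otherwise each snapshot would try to include the archives that came before it.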

    • What I see Jeff saying is “please save me from myself” — which I would suggest is really different from not having a formal SCM process. If you take my other comment about “check in everything when closing . . . ” you can get away with that if you have your own branch for development. That way it always checks in your latest to your branch, then you can consciously decide to merge that to the DEV branch when you know it is clean.

      Alternatively, you can do what I’ve recommended to our dev team, which is to set up a scheduled script that shelves everything in a given path on a regular basis. Though most of the time it’s being run nightly, there’s not an issue (at least with TFS) of running that shelve script hourly.

      Either way, you get a recoverable history of changes for yourself, and you get formalized SCM when you commit to the shared branch.

  22. I love the “trigger finger on CTRL+S”!!!!!!!!!!!!! haha, had that longer than the CTRL+X, C, V shortcuts! :>

  23. I’m not in favor of automatic check in for source control. Check-in should be a conscious decision on the part of the contributor, and should signify “this code is ready for inclusion in the branch and should build and run without causing issues”. Further, I believe developers should comment their commits so that the reason(s) for it are documented for other contributors to see and understand.

    That said, there’s no reason why a source control can’t also auto-save the latest version of a code file that isn’t ready for commitment, and flag it as a non-committed file that’s open for editing and in progress, but still stored in the repository and safe from the local machine blowing up.

  24. Interesting idea. Couldn’t a DVCS combined with a FileSystemWatcher do this easily?

    Have a FileSystemWatcher watch your project directory (and subs) for changes. On change of any file, emit a “git add -A” command to stage changes in your local project repository. You could choose to auto-commit them locally as well.

    As you said, you’d still want manual control over committing to the remote (central/build/team) server to not push breaking changes.
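
The watcher idea above can be sketched without .NET. This Python version swaps FileSystemWatcher for simple polling: it hashes every file under the project, compares against the previous pass, and runs `git add -A` when anything differs. The function names and the two-second interval are assumptions for illustration, and nothing here commits or pushes.

```python
import hashlib
import os
import subprocess
import time


def snapshot(root):
    """Map each file path under root (skipping .git) to a hash of its contents."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def changed_paths(before, after):
    """Return the sorted paths that were added, removed, or modified."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))


def stage_all(root):
    """Stage every change in the local repository; nothing is committed or pushed."""
    subprocess.run(["git", "add", "-A"], cwd=root, check=True)


def watch(root, interval=2.0):
    """Poll root forever, staging changes whenever a file differs from the last pass."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if changed_paths(previous, current):
            stage_all(root)
        previous = current
```

Calling `watch(".")` from inside a Git working tree would keep the index in step with the files on disk. Hashing the whole tree on every pass is wasteful for large projects, which is where an OS-level notification API earns its keep.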

  25. Jeff –

    I’ve been using something called “Agile Platform” from OutSystems for about 2 years now, it has this problem solved… every time you deploy to a server (oh, that’s one click from the IDE, solving another huge dev hassle… INCLUDING DB schema changes!), it creates a version on the server. You can download any given version from the server, re-deploy it (including rollbacks), etc. You can also make a bundle with an app, all its dependencies, etc. to get a single-file deployment. Every time I see an article like this that (rightfully) complains about a problem with traditional, mainstream dev, I think of companies like OutSystems (as well as others, like Alpha5) who solved these problems years ago.

    J.Ja

  26. What I see Jeff calling for has nothing to do with committing, but everything to do with making sure that the latest version is always saved, preferably off-site. If this were in place, I’d use it for more than just source control; I’d also use it for documents, presentations and so forth.

  27. There’s an extension for Visual Studio in the online gallery called ‘TFS Auto Shelve’ and it simply makes a shelveset backup of what you’ve done every five minutes.

    Your machine explodes? Pull down the shelveset on another computer and you’ve lost no more than five minutes work.

    Not quite ‘on save’, but pretty close.

    (For those that don’t know, a shelveset in TFS is a ‘group of changes that I’m making’ (really, what version of the files you’ve currently got checked out and what edits you’ve done to them), it’s unique to you and not shared. The history isn’t tracked (just the latest version); and it’s not a checkin so it doesn’t interfere with what others are doing.)

    TFS doesn’t track the history of shelvesets – so it might not be perfect. The ideal might be TFS keeping the last, say, week’s worth of edits to a shelveset.

    • I’ll have to play with the Auto Shelve script. I’d particularly like to be able to get a diff of my current version vs. the version from “a little while ago”, even though neither is committed to the main line.

  28. Somehow I just can’t find it in me to totally trust Source control packages. Always have a manual system to copy to a separate drive as well. Unfortunately my lack of trust has been proven more than once!!!!

  29. Have you taken a look at StarDock’s Keepsafe tool? I’ve never used it, but with the caveat that it stores old versions in its own system instead of on a private branch in your SCC tool it sounds like it should be able to do a lot of what you want by archiving every version you save.

    http://www.stardock.com/products/keepsafe/

  30. I think it’s more of a religious issue than a practical one.

    If it were practical, most SCMs out there wouldn’t exist today.

    Why? Because SCM users are very religious about what they do, no matter what it takes to actually save a change, even if it amounts to nothing less than typing 4 sets of command line sentences.

    This brings us back to Dropbox, which, unlike the OP, I do endorse as a valid SCM system. I don’t need anything to store bug fixes separately from the code; I’m OK with just saving them inside comments. At the end of the day, instead of having a program running a DB check at god-knows-where-server, I get a tool checking out MY code. Couple that with file versioning, syncing and diff support, and we’re all happy.

  31. Interesting, I would love the ability to have the “Dropbox”-like capabilities suggested by the author COUPLED with the ability to inject comments regarding each “checkin”.

  32. So what we’re talking about is storing the chain of undo buffers from your IDE in a persistent store. In other words, meta-source control.

  33. I would find it beneficial… even if people see it as a mere backup, having it integrated in an already installed tool (like VS + Team Explorer) and having the backups in the TFS database in a way that they can be explored and merged in the source control would be a nice addition.

  34. There is already software that can do this called AJC Active Backup. See http://www.ajcsoft.com/active-backup.htm.
    This monitors files on your hard drive and stores a revision in its revision control system each time you save a file. It can work with many types of file, not just source code, but it is most useful with source code because you can create a diff at any point to see what you have changed etc.

  35. The IBM Rational Team Concert version system actually does that. The repository workspace, which is comparable to the Git index/stage, lives on the server. Each time you save, you back up to the server as well (if you enabled check-in to repo automatically). Since RTC actually allows you to work with many pending changesets at once in the index, this is less intrusive than you might imagine. Similar to AccuRev.

    I am also in no way affiliated with IBM or RTC, just a user.

  36. I think what is really being said here is that “Save” is broken. This complaint is not just for source code but for all document formats or even more generally, the file system is broken. People routinely use multiple devices and need to sync and backup but the process is tedious so there is resistance and loss of data.

  37. What you are looking for is called a log-structured file system. While they look very similar to VCS in the details, they are built for a completely different (your?) use case.

    Have a look at NILFS.

  38. If you use an IDE such as NetBeans or Eclipse that has a “local history”, and combine that with syncing your code in something like Dropbox, you have a local “bookshelf” of your trigger-happy Ctrl-S action, your commits with some sort of documentation to make real past retrieval convenient, and of course pushing upstream when you’re stable.

  39. I’m not on board with “autosaving” files to source control.

    The biggest reason for this is that in my usual workflow I don’t commit files, I commit contexts, behaviors, or concepts. I want the history of the behaviors of my application as they were written, not a time stamp of when I saved the file.

    The other thing the auto commit would seem to do away with is comments. Yeah, I know many a developer who has shunned the commit comment, or gone around the required comment by committing something like “.” or “done” which is pointless. But meaningful comments are a help to teams that use them.

    Source control is one of our tools, one that can do many things beyond just keep the latest code safe on a server. I’m all for tools that are painless to use, but in this case I don’t want it so transparent it’s forgotten because it can be powerful.
