How to open-source your repo

(without open-sourcing your repo)


presented by

Greg Hurrell

Why?

  • Puts code and practices out in front of potential hires
  • Raises the tech profile of the organization
  • Writing for a public audience holds us to higher standards
  • Opens the door to contributions and bugfixes by third parties
  • Delivers many of the benefits of open source immediately, without an arduous open-sourcing process

The plan

  • Automatically maintain an open-source fork
  • Publish only a whitelisted subset of the larger application:
    • app/assets/
    • public/
    • spec/javascripts/

Overview

Gotchas

  • Filtered commits may not have stable patch IDs
  • You can't infer the merge base for rebasing just by looking at the shape of the DAG
  • Boundary conditions can occur when the commit range you're operating on is adjacent to a pruned commit
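
One way the first gotcha can arise (an assumption about the case meant here): a commit touches both whitelisted and non-whitelisted files, so the filtered copy carries a smaller diff and therefore a different patch ID. A minimal sketch in a throwaway repository; all file names and the Demo identity are made up.

```shell
# Throwaway repository; paths, contents and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir -p public lib
echo 'v1' > public/index.html
echo 'v1' > lib/secret.rb
git add .
git commit -q -m 'initial'

# One commit that touches a whitelisted and a non-whitelisted file.
echo 'v2' > public/index.html
echo 'v2' > lib/secret.rb
git commit -q -a -m 'change both'

# Patch ID of the whole commit...
FULL_ID=$(git show HEAD | git patch-id | cut -f 1 -d ' ')
# ...versus the patch ID of only the part that would survive filtering.
FILTERED_ID=$(git show HEAD -- public/ | git patch-id | cut -f 1 -d ' ')

echo "full:     $FULL_ID"
echo "filtered: $FILTERED_ID"
```

The two IDs differ, so such a commit cannot be matched by patch ID against its unfiltered original.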

The post-receive hook

#!/bin/sh

while read OLD NEW REF; do
  if [ "$REF" = 'refs/heads/master' ]; then
    hooks/post-receive-helper.sh "$OLD" "$NEW"
  fi
done
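
Git feeds a post-receive hook one line per updated ref on standard input, in the form "old-sha new-sha ref-name". The sketch below simulates that input to show that only updates to master reach the helper. The helper function is a hypothetical stand-in for hooks/post-receive-helper.sh, and the abbreviated SHAs are placeholders.

```shell
#!/bin/sh
# Hypothetical stand-in for hooks/post-receive-helper.sh: just reports its arguments.
helper() {
  echo "filtering $1..$2"
}

# Simulated hook input: one "<old> <new> <ref>" line per updated ref.
# The SHAs are placeholders, not real commits.
simulate_push() {
  printf '%s\n' \
    'aaaa111 bbbb222 refs/heads/topic' \
    'cccc333 dddd444 refs/heads/master'
}

OUTPUT=$(simulate_push | while read OLD NEW REF; do
  if [ "$REF" = 'refs/heads/master' ]; then
    helper "$OLD" "$NEW"
  fi
done)

echo "$OUTPUT"
```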

Listing all the files in a commit

git ls-tree -r                      \
            --name-only             \
            --full-tree $GIT_COMMIT
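
Inside filter-branch, $GIT_COMMIT is set for each commit being rewritten. Outside of it the variable has to be set by hand. A small sketch in a throwaway repository (file names and identity are made up) showing that --full-tree lists every path from the repository root, even when run from a subdirectory:

```shell
# Throwaway repository; paths, contents and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir -p app/assets app/models
echo 'alert(1)' > app/assets/app.js
echo 'class User; end' > app/models/user.rb
git add .
git commit -q -m 'initial'

# filter-branch would normally export this; here we set it ourselves.
GIT_COMMIT=$(git rev-parse HEAD)

# Run from a subdirectory to show that the listing is not limited to it.
cd app/models
FILES=$(git ls-tree -r --name-only --full-tree "$GIT_COMMIT")
echo "$FILES"
```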

Inverting our whitelist to produce a blacklist

egrep -v "^(public/|app/assets/|spec/javascripts/)"
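
To see the inversion on its own, the sketch below feeds a made-up file list (standing in for git ls-tree output) through the same pattern. It uses grep -E, which is the equivalent, more portable spelling of egrep.

```shell
# Sample paths standing in for `git ls-tree` output (made up for illustration).
ALL_FILES='app/assets/javascripts/app.js
app/models/user.rb
config/database.yml
public/index.html
spec/javascripts/app_spec.js
spec/models/user_spec.rb'

# Keep every path that does NOT start with a whitelisted prefix.
BLACKLIST=$(printf '%s\n' "$ALL_FILES" |
  grep -E -v '^(public/|app/assets/|spec/javascripts/)')

echo "$BLACKLIST"
```

Note that spec/models/ ends up on the blacklist: only spec/javascripts/ is whitelisted, not spec/ as a whole.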

Removing blacklisted files

git rm -r                         \
       --quiet                    \
       --cached                   \
       --ignore-unmatch -- $FILES
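
A sketch of what these flags do, in a throwaway repository with made-up file names: --cached removes the paths from the index only, and --ignore-unmatch keeps the command from failing on paths that are not there.

```shell
# Throwaway repository; paths, contents and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir -p public lib
echo 'hello' > public/index.html
echo 'secret' > lib/secret.rb
git add .
git commit -q -m 'initial'

# Blacklisted paths, hard-coded here; the second one does not exist.
FILES='lib/secret.rb does/not/exist.rb'

# $FILES is deliberately unquoted so it splits into separate arguments
# (this would break for paths containing whitespace).
git rm -r --quiet --cached --ignore-unmatch -- $FILES

INDEXED=$(git ls-files)
echo "$INDEXED"
```

The file lib/secret.rb is still present in the working tree afterwards; only the index entry is gone.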

Filtering a range of commits

git filter-branch -f --prune-empty --index-filter \
    '
      git rm -r                  \
             --quiet             \
             --cached            \
             --ignore-unmatch -- \
        $(git ls-tree -r          \
                      --name-only \
                      --full-tree $GIT_COMMIT |
          egrep -v \
            "^(public/|app/assets/|spec/javascripts/)")
    ' $BASE~1..filtered-master
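
A runnable sketch of the same filter on a throwaway repository. It differs from the command above in assumed ways: it rewrites the whole filtered-master branch rather than a range, uses grep -E in place of egrep, and sets FILTER_BRANCH_SQUELCH_WARNING to skip the deprecation notice that newer Git versions print. File names and identity are made up.

```shell
# Throwaway repository; paths, contents and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'
export FILTER_BRANCH_SQUELCH_WARNING=1

mkdir -p public lib
echo 'v1' > public/index.html
echo 'v1' > lib/secret.rb
git add .
git commit -q -m 'initial'
echo 'v2' > lib/secret.rb
git commit -q -a -m 'private only'
echo 'v2' > public/index.html
git commit -q -a -m 'public only'

# Filter a copy of the branch, leaving the original untouched.
git branch filtered-master

git filter-branch -f --prune-empty --index-filter '
  git rm -r --quiet --cached --ignore-unmatch -- \
    $(git ls-tree -r --name-only --full-tree $GIT_COMMIT |
      grep -E -v "^(public/|app/assets/|spec/javascripts/)")
' filtered-master >/dev/null 2>&1

COUNT=$(git rev-list --count filtered-master)
TREE=$(git ls-tree -r --name-only filtered-master)
echo "$COUNT commits, files: $TREE"
```

The "private only" commit becomes empty after filtering and is dropped by --prune-empty, so two of the three commits remain.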

Choosing $BASE

# overshoot (go back further than $OLD~1) because we
# want to make sure we produce at least one patch-id
# that we'll be able to match against what is already
# in "open"
BASE=$(git rev-list --max-count=2 $OLD~1 -- \
       app/assets/                          \
       public/                              \
       spec/javascripts/ | tail -1)
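
To make the overshoot concrete, here is a sketch on a four-commit throwaway history (tags A to D, file names and identity are all made up). With OLD at D, the command walks back from C and returns the second most recent commit that touched a whitelisted path.

```shell
# Throwaway repository; paths, contents, tags and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir -p public lib
echo 'v1' > public/index.html
echo 'v1' > lib/secret.rb
git add .
git commit -q -m 'A: public + private' && git tag A
echo 'v2' > public/index.html
git commit -q -a -m 'B: public' && git tag B
echo 'v2' > lib/secret.rb
git commit -q -a -m 'C: private' && git tag C
echo 'v3' > public/index.html
git commit -q -a -m 'D: public' && git tag D

OLD=$(git rev-parse D)

# Starting at C (= OLD~1), the commits touching whitelisted paths are B, then A.
BASE=$(git rev-list --max-count=2 "$OLD~1" -- \
       app/assets/ public/ spec/javascripts/ | tail -1)

echo "BASE is $(git log -1 --format=%s "$BASE")"
```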

Short-circuiting

CANDIDATES=$(git rev-list $OLD..$NEW -- \
             app/assets/                \
             public/                    \
             spec/javascripts/)
if [ -z "$CANDIDATES" ]; then
  # Nothing newsworthy here; bail.
  exit
fi
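
The same check, wrapped in a hypothetical decide function so that both outcomes can be shown side by side in a throwaway repository (tags t0 to t2, file names and identity are made up):

```shell
# Throwaway repository; paths, contents, tags and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir -p public lib
echo 'v1' > public/index.html
echo 'v1' > lib/secret.rb
git add .
git commit -q -m 'initial' && git tag t0
echo 'v2' > lib/secret.rb
git commit -q -a -m 'private change' && git tag t1
echo 'v2' > public/index.html
git commit -q -a -m 'public change' && git tag t2

# Hypothetical helper: prints "skip" when no commit in OLD..NEW touches
# a whitelisted path, "publish" otherwise.
decide() {
  CANDIDATES=$(git rev-list "$1..$2" -- \
               app/assets/ public/ spec/javascripts/)
  if [ -z "$CANDIDATES" ]; then
    echo skip
  else
    echo publish
  fi
}

PRIVATE_PUSH=$(decide t0 t1)
PUBLIC_PUSH=$(decide t1 t2)
echo "t0..t1: $PRIVATE_PUSH, t1..t2: $PUBLIC_PUSH"
```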

Choosing a merge base

# find the commit on filtered-master whose patch ID
# matches the tip of "open"; every commit after it
# survived filtering and is not already on "open"
OPEN=$(git format-patch -k --stdout open~1..open |
       git patch-id |
       head -1 |
       cut -f 1 -d ' ')

BASE=$(git format-patch -k --stdout \
       $BASE~1..filtered-master |
       git patch-id |
       grep "$OPEN" |
       cut -f 2 -d ' ')
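
A sketch of the patch-ID lookup in a throwaway repository. The tip of "open" is a cherry-picked copy of commit B, so it has a different SHA but the same patch ID, and the lookup maps it back to B on filtered-master. Tags, file names and identity are made up, and the range starts at tag A instead of $BASE~1.

```shell
# Throwaway repository; paths, contents, tags and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir public
echo 'v1' > public/index.html
git add .
git commit -q -m 'A' && git tag A

# "filtered-master": two further commits, B and C.
git checkout -q -b filtered-master
echo 'v2' > public/index.html
git commit -q -a -m 'B' && git tag B
echo 'v3' > public/index.html
git commit -q -a -m 'C'

# "open": diverges from A, then receives a copy of B with a different SHA.
git checkout -q -b open A
echo 'readme' > public/README
git add .
git commit -q -m 'open-only commit'
git cherry-pick B >/dev/null

# Patch ID of the tip of "open".
OPEN=$(git format-patch -k --stdout open~1..open |
       git patch-id | head -1 | cut -f 1 -d ' ')

# Commit on filtered-master that carries the same patch ID.
BASE=$(git format-patch -k --stdout A..filtered-master |
       git patch-id | grep "$OPEN" | cut -f 2 -d ' ')

echo "open tip: $(git rev-parse open)"
echo "matches:  $BASE"
```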

Transplanting the filtered history

# rebase interesting part of filtered-master
# branch (subset of OLD..NEW range) onto "open"
git rebase --onto open $BASE filtered-master
git branch -f open filtered-master
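
A sketch of the transplant in a throwaway repository. As an assumption for illustration, "open" already holds a copy of commit B, so $BASE is B and only C needs to move across. Tags, file names and identity are made up.

```shell
# Throwaway repository; paths, contents, tags and identity are illustrative only.
REPO=$(mktemp -d)
cd "$REPO"
git init -q .
git config user.name 'Demo'
git config user.email 'demo@example.com'

mkdir public
echo 'v1' > public/index.html
git add .
git commit -q -m 'A' && git tag A

git checkout -q -b filtered-master
echo 'v2' > public/index.html
git commit -q -a -m 'B' && git tag B
echo 'v3' > public/index.html
git commit -q -a -m 'C'

# "open" diverged from A and already contains a copy of B.
git checkout -q -b open A
echo 'readme' > public/README
git add .
git commit -q -m 'open-only commit'
git cherry-pick B >/dev/null

BASE=$(git rev-parse B)

# Replay everything after BASE (here: just C) on top of "open",
# then move "open" forward to the rebased tip.
git rebase -q --onto open "$BASE" filtered-master
git branch -f open filtered-master

HISTORY=$(git log --format=%s open)
echo "$HISTORY"
```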

Publishing

# push "open" to GitHub
# (fast-forward only, in case we screw up)
git push github open:open
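
The safety net here is that a plain push (no leading + on the refspec, no --force) is rejected when it is not a fast-forward. A sketch using a local bare repository as a stand-in for the GitHub remote; paths and identity are made up.

```shell
# A local bare repository stands in for the "github" remote (illustrative only).
TMP=$(mktemp -d)
git init -q --bare "$TMP/github.git"
git init -q "$TMP/work"
cd "$TMP/work"
git config user.name 'Demo'
git config user.email 'demo@example.com'
git remote add github "$TMP/github.git"

echo 'v1' > file
git add .
git commit -q -m 'first'
git branch -m open

# First push creates the remote branch.
git push -q github open:open

# Simulate a mistake: rewrite the already-published commit.
git commit -q --amend -m 'first (rewritten)'

# Without "+" or --force, the non-fast-forward update is refused.
if git push -q github open:open 2>/dev/null; then
  RESULT=accepted
else
  RESULT=rejected
fi
echo "second push: $RESULT"
```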

Next steps

  • Monitor and verify operation of system
  • Build team awareness of implications
  • Plan for expanding whitelist

The end