A gentle introduction with a brief code example to jump into GitPython.
GitPython is one of the most popular Python libraries for interacting with Git. You can use either GitPython's pure-Python functions or its git command implementation.
To install GitPython, go to terminal and run:
$ pip install gitpython
The first thing that we need to do in every git operation is to create a repository. So let's start.
Create an awesome-project directory, then pass its path to the Repo.init() function.
>>> import git
>>> # Initialize a git repository
>>> repo = git.Repo.init("~/awesome-project")
>>> repo
<git.Repo "/home/khawarizmi/awesome-project/.git">
You can skip creating the directory manually by setting the mkdir argument to True:
>>> repo = git.Repo.init("~/great-project/", mkdir=True)
If you already have an existing git repository, you can create a Repo instance for it using the Repo.__init__() method:
>>> # Create a Repo instance
>>> repo = git.Repo("~/old-project/")
If you are unable to specify the root directory of your project, you can pass any sub-directory and set search_parent_directories to True. GitPython will find the root directory for you.
>>> repo = git.Repo("~/awesome-project/docs", search_parent_directories=True)
>>> repo
<git.Repo "/home/khawarizmi/awesome-project/.git">
As mentioned at the start, GitPython offers both pure-Python functions and a git command implementation. The latter is faster but more resource-intensive. In this section, I will use both of them to give you a closer look at how to do things either way.
Let's start with the git command implementation. We don't have anything yet:
>>> repo.git.status()
'On branch master\n\nNo commits yet\n\nnothing to commit (create/copy files and use "git add" to track)'
So let's start by adding a file named hello.py. Now if we check our repo status, GitPython tells us that we have a new file.
>>> status = repo.git.status()
>>> print(status)
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
hello.py
nothing added to commit but untracked files are present (use "git add" to track)
Now let's add it to our repository index using repo.git.add():
>>> repo.git.add("hello.py")
If we check our current repository status, GitPython tells us that we have added hello.py to the repository index:
>>> status = repo.git.status()
>>> print(status)
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: hello.py
Once we are sure there is nothing more to add to hello.py, it's time to commit our changes:
>>> repo.git.commit(m="first commit")
'[master (root-commit) 0e0f6c2] first commit\n 1 file changed, 1 insertion(+)\n create mode 100644 hello.py'
As far as I know, GitPython's pure-Python API has no direct equivalent of git status. But we can get the same information from repo.untracked_files and index.diff():
>>> # list of untracked files
>>> repo.untracked_files
[]
>>> # diff between the index and the working tree
>>> repo.index.diff(None)
[]
>>> # diff between the index and the commit’s tree
>>> repo.index.diff(repo.head.commit)
The last command will raise an error since we don't have any commit yet.
After adding hello.py, we can check for any untracked files:
>>> repo.untracked_files
['hello.py']
Now let's add it to our repository index:
>>> repo.index.add('hello.py')
>>> # as always, you can inspect the return value
>>> add = repo.index.add('hello.py')
>>> add
[(100644, 8cde7829c178ede96040e03f17c416d15bdacd01, 0, hello.py)]
Let's check what we have added:
>>> len(repo.untracked_files)
0
>>> # get staged files
>>> staged = repo.index.diff("HEAD")
>>> len(staged)
1
We have added one file, and there are no untracked files left. The next step is creating a commit:
>>> repo.index.commit("first commit")
<git.Commit "b645f6e5584dce8dadeb268f731d7eb99ab01422">
To see the history (log) of your project, you can use repo.git.log():
>>> log = repo.git.log()
>>> print(log)
commit b645f6e5584dce8dadeb268f731d7eb99ab01422
Author: azzamsa <azzamsa@example.com>
Date:   Sun Feb 16 09:32:19 2020 +0700
first commit
The equivalent using the pure-Python API would be:
>>> master = repo.heads.master
>>> log = master.log()
>>> log[0]
0000000000000000000000000000000000000000 b645f6e5584dce8dadeb268f731d7eb99ab01422 azzamsa <azzamsa@example.com> 1581820339 +0700 commit (initial): first commit
Info: Whenever you are unsure what interesting attributes an object has, use the `dir()` function.
Let's check what interesting attributes the log object has.
>>> dir(log[0])
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '_re_hexsha_only', 'actor', 'count', 'format', 'from_line', 'index', 'message', 'new', 'newhexsha', 'oldhexsha', 'time']
Then use them in our code:
>>> log[0].message
'commit (initial): first commit'
>>> log[0].newhexsha
'b645f6e5584dce8dadeb268f731d7eb99ab01422'
Besides using log objects to see your history, you can use commit objects.
>>> commits = list(repo.iter_commits("master", max_count=5))
>>> commits[0].author
<git.Actor "azzamsa <azzamsa@example.com>">
>>> commits[0].committed_datetime
datetime.datetime(2020, 2, 16, 9, 32, 19, tzinfo=<git.objects.util.tzoffset object at 0x7f8463f32cf8>)
>>> commits[0].hexsha
'b645f6e5584dce8dadeb268f731d7eb99ab01422'
>>> commits[0].message
'first commit'
To list your branches you can use:
>>> repo.branches
[<git.Head "refs/heads/master">, <git.Head "refs/heads/second-branch">,
<git.Head "refs/heads/third">]
>>> # or
>>> repo.heads
[<git.Head "refs/heads/master">, <git.Head "refs/heads/second-branch">, <git.Head "refs/heads/third">]
To see your active branch:
>>> repo.active_branch
<git.Head "refs/heads/master">
Then you can check out your branch using:
>>> repo.heads.third.checkout()
<git.Head "refs/heads/third">
>>> # or using git command implementation
>>> repo.git.checkout("third")
''
The caveat is that attribute access does not work for a branch whose name contains a dash: repo.heads.second-branch.checkout() is not valid Python. In this situation you can look the branch up by name with repo.heads["second-branch"].checkout(), or use the git command with repo.git.checkout("second-branch").
If you find that GitPython is missing some git functionality, you can always fall back to the GitPython git command implementation. First, work out what the command and its parameters look like in git; then pass those parameters to the GitPython git command. Some examples:
Git log --oneline
$ git log --oneline b645f6e..86f3c62
86f3c62 (HEAD -> master) third commit
6240bd6 (third, second-branch) second commit
>>> logs = repo.git.log("--oneline", "b645f6e..86f3c62")
>>> logs
'86f3c62 third commit\n6240bd6 second commit'
>>> logs.splitlines()
['86f3c62 third commit', '6240bd6 second commit']
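Because the git command implementation returns plain strings, a little parsing is usually needed before the result is useful. A minimal sketch, using the sample output above as a hard-coded string and a hypothetical `entries` list of (short sha, subject) pairs:

```python
# Sample output copied from the repo.git.log("--oneline", ...) call above
logs = "86f3c62 third commit\n6240bd6 second commit"

# Split each line once: the first word is the short sha, the rest is the subject
entries = [tuple(line.split(" ", 1)) for line in logs.splitlines()]

for sha, subject in entries:
    print(sha, "->", subject)
```

Splitting only once keeps subjects that contain spaces intact.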
Git show: current content
$ git show 86f3c62:hello.py
print("hello world")
print("")
print("")
>>> content = repo.git.show("86f3c62:hello.py")
>>> print(content)
print("hello world")
print("")
print("")
Getting diffs:
$ git show 6240bd6 hello.py
commit 6240bd6a9111df3aa624f781ac8bad2cea551f8e (third, second-branch)
Author: azzamsa <azzamsa@example.com>
Date:   Sun Feb 16 10:13:14 2020 +0700
second commit
diff --git a/hello.py b/hello.py
index 8cde782..057280e 100644
--- a/hello.py
+++ b/hello.py
@@ -1 +1,2 @@
print("hello world")
+print("")
>>> diff = repo.git.show("6240bd6", "hello.py")
>>> print(diff)
commit 6240bd6a9111df3aa624f781ac8bad2cea551f8e
Author: azzamsa <azzamsa@example.com>
Date:   Sun Feb 16 10:13:14 2020 +0700
second commit
diff --git a/hello.py b/hello.py
index 8cde782..057280e 100644
--- a/hello.py
+++ b/hello.py
@@ -1 +1,2 @@
print("hello world")
+print("")
Git show: name only
>>> repo.git.show("--pretty=", "--name-only", "86f3c62")
'hello.py'
You can use config_writer() to change the repository configuration. One example is changing the committer username and email:
repo.config_writer().set_value("user", "name", "khwārizmī").release()
repo.config_writer().set_value("user", "email", "khwarizmi@example.com").release()
Here are some useful functions that I extract from my previous project, lupv:
# GPL-3.0
def read_file(self, filename, sha):
    """Get the content of the file at the given commit."""
    current_file = self._student_repo.git.show("{}:{}".format(sha, filename))
    return current_file

def read_diff(self, filename, sha):
    """Get the diff of the file for the given commit."""
    diff = self._student_repo.git.show(sha, filename)
    return diff

def is_exists(self, filename, sha):
    """Check if filename exists in the given record."""
    files = self._student_repo.git.show("--pretty=", "--name-only", sha)
    # Compare against whole lines so that partial names do not match
    return filename in files.splitlines()
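One thing to watch for when checking a filename against `--name-only` output: the output is a single string, so a plain `in` test matches substrings. A short sketch with a made-up file list shows the difference:

```python
# Hypothetical output of: repo.git.show("--pretty=", "--name-only", sha)
files = "hello.py.bak\ndocs/readme.md"

# Substring test on the whole string: matches part of "hello.py.bak"
substring_hit = "hello.py" in files

# Membership test on the list of lines: only exact names match
exact_hit = "hello.py" in files.splitlines()

print(substring_hit, exact_hit)
```

Splitting into lines first is the safer choice when file names can share prefixes.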
You can see another useful gist in my StackOverflow answers:
We haven't covered everything here. You can dive deeper by reading the GitPython documentation. My favorite documentation is the test file; it covers many basic things to get you started.