

Kibble

Apache Kibble is a suite of tools for collecting, aggregating and visualizing activity in software projects.
Closed source code repository tools archive#
GHTorrent monitors the GitHub public event timeline. For each event, it retrieves its contents and their dependencies, exhaustively. It then stores the JSON responses in a MongoDB database, while also extracting their structure into a MySQL database.

As you can see, its goal is similar to GH Archive. GH Archive aims at providing a more exhaustive collection of events, while GHTorrent makes a stronger effort to give you the event data in a slightly more structured way, so that it is easier to get all the information surrounding each event.
Closed source code repository tools Offline#
GHCrawler is a robust GitHub API crawler that walks a queue of GitHub entities, transitively retrieving and storing their contents. GHCrawler is especially useful if you want to keep track of a set of orgs and repositories. Note that the rate limits mentioned for the GitHub API still apply, but GHCrawler employs token pooling and rotation to optimize the use of your API tokens (if you’re able to collect several from “friends and family”).

GitHub Archive (GH Archive) is a project to record the public GitHub timeline, archive it, and make it easily accessible for further analysis. GH Archive stores all GitHub events in a set of JSON files that you can later download and process offline as you wish.

Alternatively, GH Archive is also available as a public dataset on Google BigQuery: the dataset is automatically updated every hour and enables you to run arbitrary SQL-like queries over the entire dataset in seconds.
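To make the "download and process offline" idea for GH Archive more concrete, here is a minimal sketch in Python. It assumes the archive files are gzip-compressed with one JSON event per line, and that each event carries a `type` field and a `repo.name` field. The helper names (`iter_events`, `count_event_types`) and the sample data are made up for illustration and are not part of GH Archive itself.

```python
import gzip
import io
import json
from collections import Counter


def iter_events(raw_gz_bytes):
    """Yield one event dict per non-empty line of a gzip-compressed,
    newline-delimited JSON payload (the assumed archive layout)."""
    with gzip.open(io.BytesIO(raw_gz_bytes), mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_event_types(events, repo_name=None):
    """Count events per type, optionally restricted to one repository."""
    counts = Counter()
    for event in events:
        if repo_name is not None and event.get("repo", {}).get("name") != repo_name:
            continue
        counts[event.get("type", "unknown")] += 1
    return counts


# Hypothetical sample standing in for a downloaded hourly archive file.
sample_events = [
    {"type": "WatchEvent", "repo": {"name": "octo/demo"}},
    {"type": "PushEvent", "repo": {"name": "octo/demo"}},
    {"type": "WatchEvent", "repo": {"name": "octo/other"}},
]
sample_archive = gzip.compress(
    "\n".join(json.dumps(e) for e in sample_events).encode("utf-8")
)

all_counts = count_event_types(iter_events(sample_archive))
demo_counts = count_event_types(iter_events(sample_archive), repo_name="octo/demo")
```

With a real file, you would read the downloaded bytes from disk and pass them to `iter_events` instead of `sample_archive`. Streaming line by line keeps memory usage flat, which matters because a single hour of public GitHub activity can be large.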

While I haven’t found the perfect tool (for me), at least we do have a number of good tools that will help you prepare this kind of ETL process for software data. Depending on your scenario, one of them may be enough. Let’s see the Git and/or GitHub analysis tools I know (and let me know the ones I may be missing). As usual, this post does not pretend to be an exhaustive and perfect analysis of the tools, just a way to sort out the myriad of notes and thoughts I had written down in several places.

GitHub itself offers a public API to query any project. Unfortunately, there is a limit on the hourly number of requests, so using the API is not a good solution if you’re looking to analyze large projects (or do some global analysis on a number of them). But if you want to build some kind of dashboard focused on a single project or contributor, this is more than enough. One nice aspect is that you can also subscribe to get notified after certain events occur in a project. This is the strategy we use in our stargazer bot.

Keep in mind that, via this API, you can access basically all the info you see when browsing the GitHub repo of the project, but you have a limited perspective on the internals of the “Git side of the project” (e.g. if you want to know what lines of code were modified during the last day).
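As a rough illustration of what the hourly request limit of the GitHub API means in practice, the sketch below queries a single repository and checks how long to wait once the quota is used up. It relies on the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers and on the `/repos/{owner}/{repo}` endpoint of the GitHub REST API. The function names are hypothetical, and this is a simplified sketch rather than a complete client.

```python
import json
import time
import urllib.request

API_ROOT = "https://api.github.com"


def seconds_until_reset(headers, now=None):
    """Return 0 while requests remain, otherwise the number of seconds
    to wait until the rate-limit window resets."""
    # Header names are matched case-insensitively.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0
    reset_at = int(lowered.get("x-ratelimit-reset", "0"))  # epoch seconds
    now = time.time() if now is None else now
    return max(0, int(reset_at - now))


def fetch_repo(owner, repo, token=None):
    """Fetch repository metadata; returns (parsed JSON body, response headers).
    Performs a network call, so it is not exercised in this sketch."""
    request = urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}")
    request.add_header("Accept", "application/vnd.github+json")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response), dict(response.headers)


# Offline check with hand-written headers (hypothetical values).
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}
available = {"x-ratelimit-remaining": "42", "x-ratelimit-reset": "1060"}
wait_when_exhausted = seconds_until_reset(exhausted, now=1000)
wait_when_available = seconds_until_reset(available, now=1000)
```

A dashboard for a single project could call `fetch_repo` on a schedule and sleep for `seconds_until_reset(headers)` seconds whenever the quota runs out. For analysis across many projects, this waiting is exactly what makes the plain API impractical.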
Closed source code repository tools how to#
