Add stargazers_count attribute of repositories, allow customizing sleep time to prevent API rate limit errors
simonneutert committed Nov 30, 2024
1 parent 5e1a0f8 commit ddb32a2
Showing 3 changed files with 61 additions and 13 deletions.
58 changes: 48 additions & 10 deletions Readme.md
@@ -36,8 +36,10 @@ https://knowyourmeme.com/memes/this-is-fine
- [planned features](#planned-features)
- [Prerequisities](#prerequisities)
- [Run](#run)
- [Configuration](#configuration)
- [Sleep time](#sleep-time)
- [Run in Docker](#run-in-docker)
- [Download profiles](#download-profiles)
- [Run locally](#run-locally)
- [examples](#examples)
- [Search in result files (saved profiles)](#search-in-result-files-saved-profiles)
- [examples](#examples-1)
@@ -84,6 +86,28 @@ or have it in your `.zshrc` 🤗 or whatever your shell loads at start

## Run

Here's what you need to get the thing running.

1. Configuration (optional)
2. Run in Docker
3. Run locally

### Configuration

Currently, the only configurable setting is the sleep time between request cycles.

#### Sleep time

**DEFAULT** sleep time is 30 seconds.

If you hit the GitHub API rate limit, increase the sleep time.

You can customise the sleep time between cycles by setting the `SLEEP_TIME_SECONDS` environment variable.

```bash
$ SLEEP_TIME_SECONDS=15 bb scrape <location-like-city-or-country> <language>
```
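
As a rough sketch of how such a value could be turned into the milliseconds that `Thread/sleep` expects: the helper name `parse-sleep-ms` and the fallback behaviour for invalid input are assumptions for illustration, not part of git-hire.

```clojure
;; Sketch only: `parse-sleep-ms` is a hypothetical helper, not git-hire's API.
(defn parse-sleep-ms
  "Converts a seconds string (e.g. the value of SLEEP_TIME_SECONDS) to
  milliseconds. Falls back to default-seconds when the value is missing
  or is not a positive integer."
  [raw default-seconds]
  (let [seconds (try
                  (Integer/parseInt (str raw))
                  (catch NumberFormatException _ nil))]
    (* 1000 (if (and seconds (pos? seconds)) seconds default-seconds))))

;; Read the variable once at startup; 30 mirrors the documented default.
(def sleep-time-ms
  (parse-sleep-ms (System/getenv "SLEEP_TIME_SECONDS") 30))

;; The value is already in milliseconds, so it is passed to Thread/sleep as-is:
;; (Thread/sleep sleep-time-ms)
```

Doing the seconds-to-milliseconds conversion in exactly one place makes it harder to multiply by 1000 twice by accident.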

### Run in Docker

All of the following should work in Docker, too.
@@ -97,17 +121,21 @@ $ docker run -it --rm git-hire

If you need to store the profiles, you can mount a docker volume, but this goes beyond the scope of this README.

### Download profiles
### Run locally

`$ bb scrape <location-like-city-or-country>`
```bash
$ bb scrape <location-like-city-or-country>
```

This saves the GitHub profiles as `.edn` files in the `profiles` directory,
**but** as GitHub Support let me know:
> When using the language qualifier when searching for users, it will only return users where the majority of their repositories use the specified language. (please, see [documentation](https://docs.github.com/en/search-github/searching-on-github/searching-users#search-by-repository-language))
Narrow the search further by adding a language:

`$ bb scrape <location-like-city-or-country> <language>`
```bash
$ bb scrape <location-like-city-or-country> <language>
```

**Be warned!** This might not find a PHP dev who switched to Rust recently, as described by GitHub's Support.
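
For orientation, here is a minimal sketch of what such a search might look like as a GitHub user-search URL. The `location:` and `language:` qualifiers and the `/search/users` endpoint are GitHub's; the function name `search-users-url` is hypothetical, and whether git-hire builds its query exactly this way is an assumption.

```clojure
(import '[java.net URLEncoder])

;; Sketch only: `search-users-url` is a hypothetical name, not git-hire's API.
(defn search-users-url
  "Builds a GitHub user-search URL for a location and an optional language."
  [location language]
  (let [q (cond-> (str "location:" location)
            ;; only append the language qualifier when a language is given
            language (str " language:" language))]
    (str "https://api.github.com/search/users?q="
         (URLEncoder/encode q "UTF-8"))))

(search-users-url "Berlin" "rust")
;; => "https://api.github.com/search/users?q=location%3ABerlin+language%3Arust"
```

Because the language qualifier matches only users whose repositories are mostly in that language, omitting it (passing `nil`) casts a wider net.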

@@ -137,19 +165,29 @@ After having built a pool of profiles, use

you might go further, by piping to bb again, unimaginable possibilities...

`$ mkdir rails; cp $(grep -Zril rails profiles) rails`
```bash
$ mkdir rails; cp $(grep -Zril rails profiles) rails
```

and then:

`$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'`
```bash
$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'
```

### Inspect Profiles (with examples! 🤯)

`$ bb read-profile.clj simonneutert`
```bash
$ bb read-profile.clj simonneutert
```

go further, by piping
go further, by piping:

`$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'`
```bash
$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'
```

read many profiles
then read many profiles

```bash
$ bb search-keyword ruby | bb -e '(mapv #(edn/read-string (slurp %)) *input*)'
14 changes: 11 additions & 3 deletions src/git_hire/main.clj
@@ -27,6 +27,14 @@
(def user-search-path
"/search/users")

(def default-sleep-time "30")

;; Sleep time between request cycles, in milliseconds.
(def sleep-time
  (let [sleep-time (or
                    (System/getenv "SLEEP_TIME_SECONDS")
                    default-sleep-time)]
    (* (Integer/parseInt sleep-time) 1000)))

(defn ->utf8
[s]
(URLEncoder/encode s "UTF-8"))
@@ -124,7 +132,7 @@
runs (per-page->runs total-user-count per-page)
users (:items res)]
(if (> total-user-count 1000)
(do (Thread/sleep (* 4 1000))
(do (Thread/sleep sleep-time) ; sleep-time is already in milliseconds
(recur location lang (+ 1 more-repos-than)))
(if (> runs 1)
(do (prn "getting users with more than " more-repos-than " repos")
@@ -147,7 +155,7 @@
runs (per-page->runs total-user-count per-page)
users (:items res)]
(if (> total-user-count 1000)
(do (Thread/sleep (* 4 1000))
(do (Thread/sleep sleep-time) ; sleep-time is already in milliseconds
(recur location (+ 1 more-repos-than)))
(do (file-path-location-all location)
(if (> runs 1)
@@ -161,7 +169,7 @@

(defn repo-slim
[user-repo]
(select-keys user-repo [:html_url :name :description :homepage :topics :language :updated_at]))
(select-keys user-repo [:html_url :name :description :homepage :topics :language :stargazers_count :updated_at]))

(defn repos-slim
[user-repos]
2 changes: 2 additions & 0 deletions test/git_hire/test_main.clj
@@ -46,6 +46,7 @@
:homepage "www.foo.bar"
:topics ["foo" "bar"]
:language "clojure"
:stargazers_count 10
:updated_at "2020-01-01T00:00:00Z"}]
(main/repos-slim [{:name "foo"
:html_url "bar"
@@ -68,6 +69,7 @@
:homepage "www.foo.bar"
:topics ["foo" "bar"]
:language "clojure"
:stargazers_count 10
:updated_at "2020-01-01T00:00:00Z"}
(main/repo-slim {:name "foo"
:html_url "bar"