Add stargazers_count attribute of repositories #4

Status: Open. Wants to merge 1 commit into base branch `main`.
60 changes: 49 additions & 11 deletions Readme.md
@@ -36,8 +36,10 @@ https://knowyourmeme.com/memes/this-is-fine
- [planned features](#planned-features)
- [Prerequisities](#prerequisities)
- [Run](#run)
- [Configuration](#configuration)
- [Sleep time](#sleep-time)
- [Run in Docker](#run-in-docker)
- [Download profiles](#download-profiles)
- [Run locally](#run-locally)
- [examples](#examples)
- [Search in result files (saved profiles)](#search-in-result-files-saved-profiles)
- [examples](#examples-1)
@@ -84,6 +86,28 @@ or have it in your `.zshrc` 🤗 or whatever your shell loads at start

## Run

Here's what you need to get the thing running.

1. Configuration (optional)
2. Run in Docker
3. Run locally

### Configuration

Currently, the only configurable setting is the sleep time between request cycles.

#### Sleep time

**DEFAULT** sleep time is 30 seconds.

Increase the sleep time to avoid hitting the GitHub API rate limit.

You can customise the sleep time between cycles by setting the `SLEEP_TIME_SECONDS` environment variable.

```bash
$ SLEEP_TIME_SECONDS=15 bb scrape <location-like-city-or-country> <language>
```
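
For orientation, the fallback and the seconds-to-milliseconds conversion behind `SLEEP_TIME_SECONDS` can be mirrored in plain shell. This is only a sketch of the behaviour described above, not the project's code; the function name `effective_sleep_ms` is hypothetical.

```bash
# Hypothetical helper: echoes the pause between cycles in milliseconds.
# Falls back to the documented default of 30 seconds when
# SLEEP_TIME_SECONDS is unset or empty.
effective_sleep_ms() {
  local seconds="${SLEEP_TIME_SECONDS:-30}"
  echo $(( seconds * 1000 ))
}
```

With the variable unset, `effective_sleep_ms` prints `30000`; with `SLEEP_TIME_SECONDS=60` it prints `60000`.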

### Run in Docker

All of the following should work in Docker, too.
@@ -97,17 +121,21 @@ $ docker run -it --rm git-hire

If you need to store the profiles, you can mount a docker volume, but this goes beyond the scope of this README.

### Download profiles
### Run locally

`$ bb scrape <location-like-city-or-country>`
```bash
$ bb scrape <location-like-city-or-country>
```

This saves the GitHub profiles as `.edn` files in the `profiles` directory,
**but** as GitHub Support let me know:
> When using the language qualifier when searching for users, it will only return users where the majority of their repositories use the specified language. (please, see [documentation](https://docs.github.com/en/search-github/searching-on-github/searching-users#search-by-repository-language))

Narrow the search further by adding a language:

`$ bb scrape <location-like-city-or-country> <language>`
```bash
$ bb scrape <location-like-city-or-country> <language>
```

**Be warned!** This might not find a PHP dev who switched to Rust recently, as described by GitHub's Support.
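
A possible workaround, sketched under assumptions: scrape by location only, then filter the saved profiles locally by keyword. The helper name `profiles_mentioning` is hypothetical, and the sketch assumes the profiles were saved as files somewhere below a directory such as `profiles`. The project's own `bb search-keyword` task covers similar ground.

```bash
# Hypothetical helper: list saved profile files that mention a keyword,
# case-insensitively, anywhere below the given directory.
# Usage: profiles_mentioning <keyword> [directory]
profiles_mentioning() {
  local keyword="$1"
  local dir="${2:-profiles}"
  # -r: recurse, -i: ignore case, -l: print matching file names only,
  # -F: treat the keyword as a literal string, not a regex
  grep -rilF -- "$keyword" "$dir"
}
```

After `bb scrape wiesbaden`, running `profiles_mentioning rust` would list every saved profile containing that string, regardless of the owner's majority language. It is a plain substring match, so a keyword like `rust` also matches words such as `trust`.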

@@ -120,7 +148,7 @@ After having built a pool of profiles, use
#### examples

`$ bb scrape mainz`
`$ bb scrape "Bad Schwalbach"`
`$ bb scrape "Bad Kreuznach"`
`$ bb scrape wiesbaden java`
`$ bb scrape wiesbaden php`
`$ bb scrape mainz javascript`
@@ -137,19 +165,29 @@ After having built a pool of profiles, use

you might go further, by piping to bb again, unimaginable possibilities...

`$ mkdir rails; cp $(grep -Zril rails profiles) rails`
```bash
$ mkdir rails; cp $(grep -Zril rails profiles) rails
```
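
One caveat with the one-liner above: `-Z` makes grep end each file name with a NUL byte, and bash discards NUL bytes in command substitution, so the names can run together when several files match (other shells, such as zsh, may behave differently). A NUL-safe variant, as a sketch: the function name `copy_matching_profiles` is hypothetical, and `grep --null` and `xargs -0` are assumed to be available, as they are in the GNU and BSD tools.

```bash
# Hypothetical helper: copy every file below <source-dir> that mentions
# <keyword> (case-insensitive) into <target-dir>.
# Usage: copy_matching_profiles <keyword> <source-dir> <target-dir>
copy_matching_profiles() {
  local keyword="$1" src="$2" dest="$3"
  mkdir -p "$dest"
  # --null ends each file name with a NUL byte; xargs -0 splits on NUL,
  # so names containing spaces or newlines survive intact.
  grep -rilF --null -- "$keyword" "$src" | xargs -0 -I{} cp -- {} "$dest"
}
```

For example, `copy_matching_profiles rails profiles rails` collects the matching profiles in `rails`. The copy is flat, so two profiles with the same file name in different subdirectories would overwrite each other.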

and then:

`$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'`
```bash
$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'
```

### Inspect Profiles (with examples! 🤯)

`$ bb read-profile.clj simonneutert`
```bash
$ bb read-profile.clj simonneutert
```

go further, by piping
go further, by piping:

`$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'`
```bash
$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'
```

read many profiles
then read many profiles

```bash
$ bb search-keyword ruby | bb -e '(mapv #(edn/read-string (slurp %)) *input*)'
15 changes: 12 additions & 3 deletions src/git_hire/main.clj
@@ -27,6 +27,14 @@
(def user-search-path
"/search/users")

(def default-sleep-time "30")

(def sleep-time
(let [sleep-time (or
(System/getenv "SLEEP_TIME_SECONDS")
default-sleep-time)]
(* (Integer/parseInt sleep-time) 1000)))

(defn ->utf8
[s]
(URLEncoder/encode s "UTF-8"))
@@ -124,7 +132,7 @@
runs (per-page->runs total-user-count per-page)
users (:items res)]
(if (> total-user-count 1000)
(do (Thread/sleep (* 4 1000))
(do (Thread/sleep sleep-time) ; sleep-time is already in milliseconds
(recur location lang (+ 1 more-repos-than)))
(if (> runs 1)
(do (prn "getting users with more than " more-repos-than " repos")
@@ -147,7 +155,7 @@
runs (per-page->runs total-user-count per-page)
users (:items res)]
(if (> total-user-count 1000)
(do (Thread/sleep (* 4 1000))
(do (Thread/sleep sleep-time) ; sleep-time is already in milliseconds
(recur location (+ 1 more-repos-than)))
(do (file-path-location-all location)
(if (> runs 1)
@@ -161,7 +169,7 @@

(defn repo-slim
[user-repo]
(select-keys user-repo [:html_url :name :description :homepage :topics :language :updated_at]))
(select-keys user-repo [:html_url :name :description :homepage :topics :language :stargazers_count :updated_at]))

(defn repos-slim
[user-repos]
@@ -180,6 +188,7 @@
{:name (get-in first-repo [:owner :login])
:owner_url (get-in first-repo [:owner :html_url])
:languages (user-languages cleaned-repos)
:total-stars (reduce + (map :stargazers_count cleaned-repos))
:repositories cleaned-repos}))

(defn recursive-curl
19 changes: 12 additions & 7 deletions test/git_hire/test_main.clj
@@ -1,4 +1,7 @@
(ns git-hire.test-main)
(ns git-hire.test-main
(:require
[clojure.test :as t]))

(require '[clojure.test :as t]
'[babashka.classpath :as cp]
'[git-hire.main :as main])
@@ -46,6 +49,7 @@
:homepage "www.foo.bar"
:topics ["foo" "bar"]
:language "clojure"
:stargazers_count 10
:updated_at "2020-01-01T00:00:00Z"}]
(main/repos-slim [{:name "foo"
:html_url "bar"
@@ -68,6 +72,7 @@
:homepage "www.foo.bar"
:topics ["foo" "bar"]
:language "clojure"
:stargazers_count 10
:updated_at "2020-01-01T00:00:00Z"}
(main/repo-slim {:name "foo"
:html_url "bar"
@@ -115,18 +120,18 @@
:pizza "turtles"}])))))

(t/deftest user-location-search-params-location
(t/is (= {:query-params {"per_page" 10, "q" "location:\"bad+kissingen\" repos:>=0"}}
(main/user-location-search-params-location 10 0 "Bad Kissingen")))
(t/is (= {:query-params {"per_page" 10, "q" "location:\"bad+kreuznach\" repos:>=0"}}
(main/user-location-search-params-location 10 0 "Bad Kreuznach")))
(t/is (= {:query-params {"per_page" 20, "q" "location:\"mainz\" repos:>=0"}}
(main/user-location-search-params-location 20 0 "Mainz"))))

(t/deftest file-path-location-all
(t/is (= "./profiles/mainz/all/"
(main/file-path-location-all "Mainz")))
(t/is (= "./profiles/bad kissingen/all/"
(main/file-path-location-all "Bad Kissingen"))))
(t/is (= "./profiles/bad kreuznach/all/"
(main/file-path-location-all "Bad Kreuznach"))))

(t/deftest user-location-search-params-location-lang
(t/is (= {:query-params {"per_page" 10,
"q" "location:\"bad+kissingen\" repos:>=0 language:\"clojure\""}}
(main/user-location-search-params-location-lang 10 0 "Bad Kissingen" "clojure"))))
"q" "location:\"bad+kreuznach\" repos:>=0 language:\"clojure\""}}
(main/user-location-search-params-location-lang 10 0 "Bad Kreuznach" "clojure"))))