
Lovely! I only learnt today from my colleagues at ClickHouse that clickhouse has a git-import tool. So if you also want to give it a go:

Download clickhouse: curl https://clickhouse.com/ | sh

Check out documentation for git-import: ./clickhouse git-import --help

Then the tool can be run directly inside the git repository. It will collect data such as commits, file changes, and changes of every line in every file for further analysis. It works well even on the largest repositories, like Linux or Chromium.

Example of a trivial query:

SELECT author AS k, count() AS c
FROM line_changes
WHERE file_extension IN ('h', 'cpp')
GROUP BY k
ORDER BY c DESC
LIMIT 20

Example of a non-trivial query, showing for each author how much of their code was removed by other authors:

SELECT
    k,
    written_code.c,
    removed_code.c,
    round(removed_code.c * 100 / written_code.c) AS remove_ratio
FROM
(
    SELECT author AS k, count() AS c
    FROM line_changes
    WHERE sign = 1
      AND file_extension IN ('h', 'cpp')
      AND line_type NOT IN ('Punct', 'Empty')
    GROUP BY k
) AS written_code
INNER JOIN
(
    SELECT prev_author AS k, count() AS c
    FROM line_changes
    WHERE sign = -1
      AND file_extension IN ('h', 'cpp')
      AND line_type NOT IN ('Punct', 'Empty')
      AND author != prev_author
    GROUP BY k
) AS removed_code USING (k)
WHERE written_code.c > 1000
ORDER BY c DESC
LIMIT 500
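To make the logic of that second query easier to follow, here is a rough Python sketch of the same computation on a handful of made-up rows. The sample data, the `remove_ratios` name, and the `min_written` parameter (standing in for the `> 1000` filter) are all illustrative assumptions; the real table has many more columns.

```python
from collections import Counter

# Hypothetical sample rows with only the columns the query touches:
# (author, prev_author, sign, file_extension, line_type)
rows = [
    ("alice", "", 1, "cpp", "Code"),
    ("alice", "", 1, "cpp", "Code"),
    ("alice", "", 1, "h", "Code"),
    ("alice", "", 1, "cpp", "Code"),
    ("bob", "alice", -1, "cpp", "Code"),    # bob removes a line written by alice
    ("alice", "alice", -1, "cpp", "Code"),  # self-removal: excluded
    ("bob", "", 1, "cpp", "Empty"),         # empty line: excluded
]


def remove_ratios(rows, min_written=1):
    """Percent of each author's written lines that were removed by others."""
    written, removed = Counter(), Counter()
    for author, prev_author, sign, ext, line_type in rows:
        if ext not in ("h", "cpp") or line_type in ("Punct", "Empty"):
            continue
        if sign == 1:
            written[author] += 1
        elif sign == -1 and author != prev_author:
            removed[prev_author] += 1
    # Equivalent of the INNER JOIN plus the written_code.c threshold.
    return {
        k: round(removed[k] * 100 / written[k])
        for k in written
        if k in removed and written[k] > min_written
    }


print(remove_ratios(rows))
```

In this toy data alice wrote four qualifying lines and one was removed by someone else, so only she appears in the result.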



> Download clickhouse: curl https://clickhouse.com/ | sh

Does this check the User-Agent to change the response? Clicking that link shows their home page.


That is exactly what it does ;) If you don't feel comfortable with curl | sh, you can download the clickhouse binary from the repo here: https://github.com/ClickHouse/ClickHouse/releases

;)


Changing the content from an HTML page to a shell script based on User-Agent is a pretty bad abuse of HTTP. Why not at least require `-H 'Accept: text/x-shellscript'`? Or be more basic and give the script its own URL.
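A minimal sketch of the difference between the two approaches, assuming a server that only looks at request headers. The function names and the exact matching rules are hypothetical and are not how clickhouse.com is actually implemented.

```python
def by_user_agent(headers):
    """Sniffing: serve the script whenever the client looks like curl."""
    ua = headers.get("User-Agent", "")
    return "script" if ua.lower().startswith("curl/") else "html"


def by_accept(headers):
    """Content negotiation: serve the script only when it is asked for."""
    accept = headers.get("Accept", "")
    # Drop parameters such as ";q=0.8" and compare bare media types.
    types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "script" if "text/x-shellscript" in types else "html"


# curl with default headers: sniffing hands back a script, negotiation does not.
curl_default = {"User-Agent": "curl/8.4.0", "Accept": "*/*"}
print(by_user_agent(curl_default), by_accept(curl_default))

# An unknown tool that explicitly asks for the script.
other_tool = {"User-Agent": "Wget/1.21", "Accept": "text/x-shellscript"}
print(by_user_agent(other_tool), by_accept(other_tool))
```

With sniffing, the result depends on which tool the server happens to recognise; with the Accept header, it depends only on what the client asked for.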


Based on what reasoning? (Honestly curious)


If I want to download your homepage with curl to read offline, I get a script? If I use a tool you don't know to get the installer, I execute HTML?

If I run curl on Windows, do I get this script? A PowerShell version?

Why not make it https://clickhouse.com/linux-installer?


These are totally legit concerns. The site has behaved this way for quite some time, and many ClickHouse installation scripts may depend on it, so we will keep it for backward compatibility. We will also add the usual install.sh URL later and start sharing that more often.

(Pull request is in ... it should be deployed on Monday and you can use https://clickhouse.com/install.sh ). Love the feedback, please keep it coming!


Because someone may want to preview the script in browser.

Because someone may not have curl and use another tool your server doesn't know.


To what resource does this URL (Uniform Resource Locator) refer? A web page or a script?



