Connect to multiple databases, run and generate raw SQL queries, perform analysis and make visualization.
This package is currently in pre-release and is under active development.
- Expect Bugs: While the core functionality works, some features may not be fully stable.
- Frequent Updates: APIs and features are subject to change in upcoming versions.
- Contributions Welcome: Feedback and contributions are highly encouraged to help improve the package.
- 🚀 SqthonAI: Generate SQL queries using a Large Language Model (LLM) of your choice.
- 💀 Security Improvements.
- 🆕 New Features and Enhancements.
- 🙄 Custom Exceptions for Better Error Messages.
- ✅ Unit Test Coverage.
If you don't have mysql installed, you can download it from here. ⭐
https://github.com/HimrajDas/SQTHON.git
cd sqthon
Using Windows powershell
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
Using Linux, macOS, Windows (WSL)
curl -sSL https://install.python-poetry.org | python3 -
Using pipx
pipx install poetry
poetry install
pip install git+https://github.com/HimrajDas/SQTHON
- set database passwords like this:
<username>password
✅
from sqthon import Sqthon
# Instantiate the class. Passwords gets fetch from the .env file (that's why you have to create it)
sq = Sqthon(dialect="mysql", user="root", host="localhost", service_instance_name="MySQL service instance name")
# Connects to a database
conn1 = sq.connect_to_database(database="dbname", local_infile=True) # local_infile controls the infile settings for the client.
conn2 = sq.connect_to_database("dbname")
# or you can connect like this:
conn3 = sq.connect_db.connect(database="dbname") # not preferred ❌.
If your MySQL server is not running then providing service_instance_name will start the server automatically. If you are not running the script as an administrator, it will ask for admin privilege to start the server.
dummy_conn = sq.connect_to_database(database="dummy")
# Suppose, You have a table named sales in the dummy database.
query = """
SELECT customer_name FROM sales;
"""
customer_names = dummy_conn.run_query(query=query) # it will return the result as pandas dataframe.
run_query have several params other than query, they are: visualize: bool = False, plot_type: str = None, x=None, y=None, title=None. If you make visualize=True and provide x, y and plot_type args then it will return a graph along with the data which I don't think is good for later use of the variable.
from sqthon.data_visualizer import DataVisualizer as dv
conn1 = sq.connect_to_database("store_sales", infile=True)
query = """
SELECT YEAR(sales_month) as sales_year,
SUM(sales) AS sales,
kind_of_business
FROM us_store_sales
WHERE kind_of_business IN ('Men''s clothing stores', 'Women''s clothing stores', 'Family clothing stores')
GROUP BY sales_year, kind_of_business;
""" # a query I performed on my database 😁
yearly_sales = conn1.run_query(query=query)
dv.plot(data=yearly_sales, plot_type="line", x="sales_year", y="sales", hue="kind_of_business")
I have isolated this feature for several security reasons. What do I mean is that it uses a separate engine to import the csv to a table which you don't need to worry about 😎
Currently, it supports mysql only.
- csv_path: str
- table: str
- lines_terminated_by: str
In windows lines_terminated_by is generally '\r\n,' though you should inspect it before trying to import.
table: table name, if it doesn't exist then it will create the table according to the csv file. You don't need to worry about data types. It will handle it.
To import a file to mysql, you need to enable global infile both in server and client. In client it turns on when you set infile=True in connect to database.
sq.server_infile_status() # Returns True if it's on.
sq.global_infile_mode(mode="on") # mode accepts one of two values only: "on" or "off"
Let's import it.
conn1 = sq.connect_to_database(database="example_db", local_infile=True) # local_infile = True
# if global infile is off in the server, then you have to turn it on. Just do this:
sq.global_infile_mode("on")
conn1.import_csv_to_mysqldb(csv_path="/path/to/csv", table="dummy", lines_terminated_by="\n")
# tip: you can use hex editor to analyze the csv file. If it have 0D 0A after end of the row, then
# it's terminated by '\r\n'
You can use generate_date_series to create date series like postgres generate_series.