Note that, given the nature of this project, progress depends heavily on iteration, and no plan is straightforward for this kind of work.
Future Plans:
- add a visual management system
- back to the future: add time travel to the memory system and agent
- nerds with needles: knowledge injection
- assembly of philosophers and a chalice: agent orchestration
Welcome to the Kaggle Problem Solver, the Swiss Army knife of machine learning challenges! This isn't just any old problem solver: it's your AI-powered companion in the wild world of Kaggle competitions. Using a "plan and execute" strategy that would make any project manager jealous, our system tackles ML problems with the finesse of a seasoned data scientist and the tireless energy of a thousand interns. The code generation agent is inspired by the LangGraph agent (link).
- The Mastermind (KaggleProblemPlanner): Plans your path to Kaggle glory!
- The Perfectionist (KaggleTaskEnhancer): Turns good tasks into great ones!
- The Code Wizard (CodeGenerationAgent): Conjures code like magic! ✨
- The Strategist (KaggleProblemRePlanner): Adapts faster than a chameleon in a rainbow!
- The Executor (KaggleCodeExecutor): Runs code faster than you can say "machine learning"!
1. Scrape → 2. Analyze data → 3. Plan → (4. Enhance → 5. Code → 6. Execute) → Repeat!
It's like a never-ending dance party, but with more algorithms and less awkward small talk.
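In plainer terms, the loop looks roughly like this. A minimal sketch, assuming the agents above expose `plan`, `enhance`, `generate`, and `execute` methods and that `scrape_and_analyze`, `update_state`, and `challenge_url` are helpers of our own invention (illustrative names, not the project's exact API):

```python
# Rough sketch of the plan-and-execute loop; method and helper names are assumptions.
planner = KaggleProblemPlanner()
enhancer = KaggleTaskEnhancer()
coder = CodeGenerationAgent()
executor = KaggleCodeExecutor()

state = scrape_and_analyze(challenge_url)            # 1. Scrape + 2. Analyze data
tasks = planner.plan(state)                          # 3. Plan

for task in tasks:
    enhanced = enhancer.enhance(task, state)         # 4. Enhance
    code = coder.generate(enhanced, state)           # 5. Code
    result = executor.execute(code)                  # 6. Execute
    state = update_state(state, task, code, result)  # feed results back for re-planning
```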
Behold, the pièce de résistance of our project: the Agent Graph!
```mermaid
graph TB
%% Define styles
style A fill:#f9f,stroke:#333,stroke-width:2px
style H fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#bbf,stroke:#333,stroke-width:1px
style C fill:#cfc,stroke:#333,stroke-width:1px
style D fill:#fcc,stroke:#333,stroke-width:1px
style E fill:#ffc,stroke:#333,stroke-width:1px
style F fill:#ccf,stroke:#333,stroke-width:1px
style G fill:#fcf,stroke:#333,stroke-width:1px
A((Start)) --> B[Scraper]
B --> G[Data Utils]
G --> D[Planner]
D --> F[Enhancer]
F --> C[Code Agent]
C --> E[Executor]
H((Finish))
subgraph Code_Agent_Process [Code Agent Process]
style Code_Agent_Process fill:#cfc,stroke:#333,stroke-width:1px
I((Start))
J[Generate Code]
K{Is Code Valid?}
L((Finish))
I --> J
J --> K
K -- Yes --> L
K -- No --> J
end
%% Link the main process to subgraph
C -->|Initiates| I
L -->|Returns| E
%% Annotations
classDef annotation fill:#fff,stroke:none,color:#333,font-size:12px;
class B,G,D,F,C,E annotation;
%% Annotating Feedback Loops
E -. Feedback Loop .-> F
E -. Completion .-> H
```
This isn't just any graph: it's a visual symphony of our agents working in harmony. Watch as data flows through our system like a well-choreographed ballet of bits and bytes!
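For the code-minded, the same topology can be written as a LangGraph `StateGraph`. This is a minimal sketch, assuming a simple `TypedDict` state and placeholder node functions named after the agents; the real wiring lives in `agent.py` and `KaggleProblemSolver`:

```python
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class SolverState(TypedDict, total=False):
    challenge_info: dict
    remaining_tasks: List[dict]
    results: List[dict]


# Placeholder node functions; in the real project each wraps one agent.
def scraper_node(state: SolverState) -> dict: return {}
def data_utils_node(state: SolverState) -> dict: return {}
def planner_node(state: SolverState) -> dict: return {}
def enhancer_node(state: SolverState) -> dict: return {}
def code_agent_node(state: SolverState) -> dict: return {}
def executor_node(state: SolverState) -> dict: return {}


workflow = StateGraph(SolverState)
for name, node in [
    ("scraper", scraper_node),
    ("data_utils", data_utils_node),
    ("planner", planner_node),
    ("enhancer", enhancer_node),
    ("code_agent", code_agent_node),
    ("executor", executor_node),
]:
    workflow.add_node(name, node)

workflow.set_entry_point("scraper")
workflow.add_edge("scraper", "data_utils")
workflow.add_edge("data_utils", "planner")
workflow.add_edge("planner", "enhancer")
workflow.add_edge("enhancer", "code_agent")
workflow.add_edge("code_agent", "executor")

# Feedback loop: return to the enhancer until no tasks remain, then finish.
workflow.add_conditional_edges(
    "executor",
    lambda state: "enhancer" if state.get("remaining_tasks") else END,
)

graph = workflow.compile()
```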
1. Clone this repo faster than you can say "git":

   ```bash
   git clone https://github.com/msnp1381/kaggle-agent.git
   ```

2. Start the required services using Docker Compose:

   ```bash
   docker-compose up -d
   ```

3. Install Poetry if you haven't already:

   ```bash
   curl -sSL https://install.python-poetry.org | python3 -
   ```

4. Set up the Python environment:

   ```bash
   poetry install
   ```

5. Configure the project:

   - Copy the `.env.template` file to `.env`:

     ```bash
     cp .env.template .env
     ```

   - Open the `.env` file and fill in the required environment variables.
   - Review and update the `config.ini` file if necessary.

6. Run the main script:

   ```bash
   poetry run python main.py
   ```
The Kaggle Problem Solver can be customized using the `config.ini` file. This file allows you to adjust various settings without modifying the code directly. Here's how you can change the configuration:
1. Open the `config.ini` file in a text editor.
2. Modify the values as needed. Here are some key sections and their purposes:

   ```ini
   [General]
   recursion_limit = 50  # Set the maximum recursion depth

   [API]
   base_url = https://api.avalapis.ir/v1  # API base URL
   model = gpt-4o-mini  # Choose the AI model
   temperature = 0.0  # Set the creativity level (0.0 for deterministic, higher for more randomness)

   [Kaggle]
   default_challenge_url = https://www.kaggle.com/c/spaceship-titanic/  # Default Kaggle challenge URL

   [Logging]
   level = INFO  # Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
   file = kaggle_solver.log  # Log file name

   [Jupyter]
   server_url = http://127.0.0.1:8888/  # Jupyter server URL
   # token = your_jupyter_token  # Uncomment and set your Jupyter token if required

   [MongoDB]
   db_name = challenge_data  # MongoDB database name
   ```

3. Save the file after making your changes.
The configuration is loaded when the Kaggle Problem Solver starts. To apply changes:
- Stop the current execution of the Kaggle Problem Solver.
- Restart the application to load the new configuration.
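For reference, here is a minimal sketch of how those values can be read at startup with the standard-library `configparser`; the project's actual loading code may differ:

```python
import configparser

# Load config.ini; inline "#" comments (as in the sample above) are stripped
# via inline_comment_prefixes. Illustrative only.
config = configparser.ConfigParser(inline_comment_prefixes=("#",))
config.read("config.ini")

recursion_limit = config.getint("General", "recursion_limit", fallback=50)
model = config.get("API", "model", fallback="gpt-4o-mini")
temperature = config.getfloat("API", "temperature", fallback=0.0)
log_level = config.get("Logging", "level", fallback="INFO")
jupyter_url = config.get("Jupyter", "server_url", fallback="http://127.0.0.1:8888/")
```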
Customize your config like you're picking toppings for a pizza:
```python
config = {
    "callbacks": [langfuse_handler],
}
```
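That dict follows the standard LangChain `RunnableConfig` shape, so it can be handed to the compiled graph when you kick things off. A minimal sketch, assuming `graph` is the compiled workflow and `initial_state` is your starting state (both names are illustrative):

```python
from langfuse.callback import CallbackHandler  # import path may vary across Langfuse versions

langfuse_handler = CallbackHandler()  # picks up Langfuse keys from the environment

config = {"callbacks": [langfuse_handler]}

# Pass the callbacks config at invocation time so every agent call gets traced.
final_state = graph.invoke(initial_state, config=config)
```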
- Create new agents like you're assembling an AI Avengers team.
- Integrate them into `agent.py`: it's like introducing your new friends to your old crew.
- Update `KaggleProblemSolver` to include your new agent in the coolest workflow in town (a rough sketch follows below).
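For instance, a hypothetical new agent might look something like this; the class name, the `llm` dependency, and the `workflow` registration calls are illustrative assumptions, not the project's exact API:

```python
class FeatureEngineeringAgent:
    """Hypothetical extra agent that proposes feature-engineering steps."""

    def __init__(self, llm):
        self.llm = llm  # any LangChain-style chat model

    def __call__(self, state: dict) -> dict:
        # Read the shared state and return a partial state update.
        suggestions = self.llm.invoke(
            f"Suggest feature engineering steps for: {state.get('dataset_summary', '')}"
        )
        return {"feature_suggestions": suggestions}


# Register it as one more node in the workflow (node and edge names are illustrative).
workflow.add_node("feature_engineer", FeatureEngineeringAgent(llm))
workflow.add_edge("planner", "feature_engineer")
workflow.add_edge("feature_engineer", "enhancer")
```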
Got ideas? We want them! Check out `CONTRIBUTING.md` for how to join our merry band of AI enthusiasts. Remember, in this repo, there are no bad ideas, only "learning opportunities"!
Stay connected and collaborate with fellow enthusiasts in our Telegram group: Join here
This project is licensed under the MIT License - see the `LICENSE` file for details. In other words, go wild, but don't forget to give us a high-five if you use it!
Our Kaggle Problem Solver comes equipped with a sophisticated memory system that acts as the brain of our AI, allowing it to learn, adapt, and make informed decisions throughout the problem-solving process. Here's how it works:
- Short-Term Memory:
  - Stores recent interactions and important information
  - Uses a weighted system to prioritize crucial data
  - Helps maintain context during the problem-solving process
- Long-Term Memory:
  - Utilizes Chroma, a vector database, for efficient storage and retrieval
  - Stores documents, code snippets, and execution results
  - Enables semantic search for relevant information
- Examples Memory:
  - Stores successful task executions (task, code, result)
  - Used for few-shot learning and providing relevant examples
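To make the three tiers concrete, here is a heavily simplified skeleton of what such a memory agent might look like, using `chromadb` for the long-term store. The method names mirror the calls shown below, but the internals are illustrative assumptions rather than the project's actual implementation:

```python
from typing import List, Optional

import chromadb


class MemoryAgentSketch:
    """Heavily simplified illustration of the three memory tiers described above."""

    def __init__(self) -> None:
        self.short_term: List[dict] = []   # recent, importance-weighted items
        self.examples: List[dict] = []     # successful (task, code, result) triples
        client = chromadb.Client()         # long-term vector store
        self.docs = client.get_or_create_collection("documents")

    def add_to_short_term_memory(self, text: str, importance: float = 1.0) -> None:
        self.short_term.append({"text": text, "importance": importance})
        # Keep the most important items first so context stays relevant.
        self.short_term.sort(key=lambda item: item["importance"], reverse=True)

    def add_example(self, task: str, code: str, result: str) -> None:
        self.examples.append({"task": task, "code": code, "result": result})

    def add_document(self, content: str, doc_type: str, metadata: dict) -> str:
        doc_id = f"doc-{self.docs.count()}"
        self.docs.add(
            ids=[doc_id],
            documents=[content],
            metadatas=[{"doc_type": doc_type, **metadata}],
        )
        return doc_id

    def search_documents(self, query: str, doc_type: Optional[str] = None, n: int = 4):
        # Semantic search over the long-term store, optionally filtered by doc_type.
        where = {"doc_type": doc_type} if doc_type else None
        return self.docs.query(query_texts=[query], n_results=n, where=where)
```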
The memory agent continuously updates a summary of the project's progress:
```python
updated_summary = memory_agent.update_summary(task, code, result)
```
Combines short-term and long-term memory for informed responses:
```python
answer = memory_agent.ask("What are the key points of the challenge?")
```
Finds relevant information based on meaning, not just keywords:
```python
relevant_docs = memory_agent.search_documents("AI advancements", doc_type="tech_report")
```
Retrieves similar examples to guide new task executions:
```python
few_shot_examples = memory_agent.get_few_shots(task, n=4)
```
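Those examples can then be folded into the code-generation prompt. A tiny illustration, assuming each stored example is a dict with `task`, `code`, and `result` keys (an assumption about the stored shape):

```python
# Render retrieved examples as a few-shot block ahead of the new task.
few_shot_block = "\n\n".join(
    f"Task: {ex['task']}\nCode:\n{ex['code']}\nResult: {ex['result']}"
    for ex in few_shot_examples
)
prompt = f"{few_shot_block}\n\nTask: {task}\nCode:"
```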
Adds and retrieves documents with metadata:
```python
doc_id = memory_agent.add_document("Document content", "doc_type", {"metadata": "value"})
document = memory_agent.load_document(doc_id)
```
Adds important information to short-term memory with priority:
```python
memory_agent.add_to_short_term_memory("Important info", importance=1.5)
```
```python
# Retrieve relevant examples
few_shot_examples = memory_agent.get_few_shots(current_task, n=4)

# Access documentation
relevant_docs = memory_agent.search_documents(query, doc_type="documentation")

# Maintain context
memory_agent.add_to_short_term_memory(f"Generated code: {code}", importance=1.5)

# Add executed task to examples
memory_agent.add_example(task, code, result)
```
```python
# Initialize document retrieval
memory_agent.init_doc_retrieve()

# Access challenge information
challenge_info = memory_agent.ask(f"What are the key points of the {challenge_name} challenge?")

# Retrieve relevant context
relevant_context = memory_agent.ask_docs(current_task)

# Add enhanced task to memory
memory_agent.add_to_short_term_memory(str(enhanced_task))
```
By leveraging this powerful memory system across all components, our Kaggle Problem Solver becomes more than just a code generator: it's a learning, adapting, and evolving AI partner in your machine learning journey!