Pair Browsing - AI-Powered Browser Assistant

A Chrome extension that provides an AI-powered assistant to help you navigate and interact with web pages through natural language commands.

Quick Demo

Features

🤖 Natural Language Control: Interact with web pages using simple text commands
🖱️ Visual Cursor: See where the AI assistant is clicking with an animated cursor
🔍 Smart Element Detection: Automatically identifies and interacts with clickable elements
⌨️ Form Filling: Fill out forms and input fields with natural language instructions
📜 Content Extraction: Extract page content in various formats (text, markdown)
🔄 Navigation: Search Google, navigate to URLs, and go back in history
⬆️ Scrolling: Control page scrolling with natural commands
⚡ Multiple AI Providers: Support for both OpenAI and Google Gemini

Available Actions

Click: Click on any interactive element on the page
Fill: Input text into form fields
Search Google: Perform Google searches directly
Navigate: Go to specific URLs or go back in history
Scroll: Scroll the page up or down
Send Keys: Send keyboard inputs to active elements
Extract Content: Get page content as text or markdown

Installation

Clone this repository or download the source code
Open Chrome and navigate to chrome://extensions/
Enable "Developer mode" in the top right corner
Click "Load unpacked" and select the directory containing the extension files

Configuration

Click the extension icon in Chrome to open the options page
Choose your preferred AI provider (OpenAI or Google Gemini)
Configure the API settings:

For OpenAI:

Enter your OpenAI API key
Optionally customize the model (default: gpt-4o-mini)

For Google Gemini:

Enter your Gemini API key
Optionally customize the model (default: gemini-2.0-flash-exp)

Usage

Click the extension icon to open the sidebar
Type your command in natural language (e.g., "Click the login button" or "Fill the email field with [email protected]")
The AI assistant will:
- Analyze the page structure
- Identify relevant elements
- Execute the requested action
- Provide visual feedback with the cursor

Debug Mode

Enable debug mode in the options page to:

View captured screenshots in the chat
See detailed logging information
Help troubleshoot interactions

Examples

Here are some example commands you can try:

"Click the sign up button"
"Fill the username field with johndoe"
"Search for Chrome extensions"
"Scroll down one page"
"Go back to the previous page"
"Extract the main content as markdown"

Requirements

Google Chrome browser
API key from either OpenAI or Google Gemini
Active internet connection

Technical Details

The extension consists of:

Content Script: Handles page interactions and element detection
Background Script: Manages AI communication and extension state
Sidebar Interface: Provides the chat interface for user commands
Options Page: Allows configuration of AI providers and settings

Privacy & Security

API keys are stored locally in Chrome storage
Screenshots are only used for AI analysis and are not stored
No data is collected or stored outside of your browser

Troubleshooting

If the extension isn't working as expected:

Check if your API key is correctly configured
Ensure you have an active internet connection
Try refreshing the page
Check the browser console for error messages
Enable debug mode for more detailed logging

License

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
.vscode		.vscode
icons		icons
lib		lib
LICENSE		LICENSE
README.md		README.md
background.js		background.js
contentScript.js		contentScript.js
manifest.json		manifest.json
options.html		options.html
options.js		options.js
sidebar.css		sidebar.css
sidebar.html		sidebar.html
sidebar.js		sidebar.js
storage.js		storage.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pair Browsing - AI-Powered Browser Assistant

Quick Demo

Features

Available Actions

Installation

Configuration

For OpenAI:

For Google Gemini:

Usage

Debug Mode

Examples

Requirements

Technical Details

Privacy & Security

Troubleshooting

License

About

Releases

Packages

Languages

License

rbarazi/pair-browsing

Folders and files

Latest commit

History

Repository files navigation

Pair Browsing - AI-Powered Browser Assistant

Quick Demo

Features

Available Actions

Installation

Configuration

For OpenAI:

For Google Gemini:

Usage

Debug Mode

Examples

Requirements

Technical Details

Privacy & Security

Troubleshooting

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages