mirror of
https://github.com/ItzCrazyKns/Perplexica.git
synced 2025-09-15 05:51:33 +00:00
Compare commits: `56cd278ed4...develop/v1` — 54 commits

59ab10110a, 10f9cd2f79, 82e1dd73b0, 728b499281, 5a4dafc753, 4ac99786f0,
1224281278, 3daae29a5d, 50bcaa13f2, 31e4abf068, fd6e701cf0, 89880a2555,
07776d8699, a24992a3db, 32fb6ac131, 99137d95e7, 490a8db538, aba702c51b,
89a6e7fbb1, f19d2e3a97, 4a7ca8fc68, 3d642f2539, aa91d3bc60, 93c5ed46f6,
af4b97b766, ca86a7e358, 99351fc2a6, 7a816efc04, d584067bb1, 4d41243108,
6c218b5fee, 1c1f31e23a, 5b15bcfe17, df4350f966, 652ca2fdf4, 216576128d,
bb3f180583, 4d24d73161, 2e166c217b, 4c73caadf6, 5f0b87f4a9, 115e6b2a71,
a5c79c92ed, db3cea446e, 8e683d266a, e9ab425cee, 811c0c6fe1, cab1aa705c,
5cbc512322, 41d056e755, 07dc7d7649, 7ec201d011, 3582695054, 46541e6c0c
`.env.example` (11 lines deleted):

```
PORT=3000
NODE_ENV=development
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_key
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=llama2
SEARXNG_URL=http://localhost:4000
SEARXNG_INSTANCES=["http://localhost:4000"]
MAX_RESULTS_PER_QUERY=50
CACHE_DURATION_HOURS=24
CACHE_DURATION_DAYS=7
```
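The deleted `.env.example` above enumerated the variables the app expected. As a sketch, a startup check for such variables could look like this (the variable names come from the file above; the helper name `missingEnvVars` is hypothetical, not part of the project):

```typescript
// Sketch: fail fast when required environment variables are absent.
// The list mirrors the deleted .env.example; trim it to your deployment.
const REQUIRED_VARS = [
  'SUPABASE_URL',
  'SUPABASE_KEY',
  'OLLAMA_URL',
  'SEARXNG_URL',
] as const;

function missingEnvVars(env: Record<string, string | undefined>): string[] {
  // Unset and empty-string values are both treated as missing.
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Example: report missing configuration at startup.
const missing = missingEnvVars(process.env as Record<string, string | undefined>);
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(', ')}`);
}
```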
`.github/workflows/ci.yml` (vendored; 29 lines deleted):

```yaml
---
name: CI

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Setup Node.js
        uses: actions/setup-node@v2
        with:
          node-version: '18'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm test

      - name: Run type check
        run: npm run build
```
`.gitignore` (vendored):

```diff
@@ -1,32 +1,43 @@
-# Environment variables
-.env
-.env.*
-!.env.example
-
-# Dependencies
-node_modules/
-yarn-error.log
-npm-debug.log
-
-# Build outputs
-dist/
-build/
-.next/
-
-# IDE/Editor
-.vscode/
-.idea/
-*.swp
-*.swo
-
-# OS
-.DS_Store
-Thumbs.db
-
-# Logs
-logs/
-*.log
-
-# Cache
-.cache/
-.npm/
+# Node.js
+node_modules/
+npm-debug.log
+yarn-error.log
+package-lock.json
+
+# Build output
+/.next/
+/out/
+/dist/
+
+# IDE/Editor specific
+.vscode/
+.idea/
+*.iml
+
+# Environment variables
+.env
+.env.local
+.env.development.local
+.env.test.local
+.env.production.local
+
+# Config files
+config.toml
+
+# Log files
+logs/
+*.log
+
+# Testing
+/coverage/
+
+# Miscellaneous
+.DS_Store
+Thumbs.db
+
+# Db
+db.sqlite
+/searxng
+
+# Dev
+docker-compose-dev.yaml
```
`README.md`:

````diff
@@ -1,120 +1,178 @@
-# BizSearch
-
-A tool for finding and analyzing local businesses using AI-powered data extraction.
-
-## Prerequisites
-
-- Node.js 16+
-- Ollama (for local LLM)
-- SearxNG instance
-
-## Installation
-
-1. Install Ollama:
-```bash
-# On macOS
-brew install ollama
-```
-
-2. Start Ollama:
-```bash
-# Start and enable on login
-brew services start ollama
-
-# Or run without auto-start
-/usr/local/opt/ollama/bin/ollama serve
-```
-
-3. Pull the required model:
-```bash
-ollama pull mistral
-```
-
-4. Clone and set up the project:
-```bash
-git clone https://github.com/yourusername/bizsearch.git
-cd bizsearch
-npm install
-```
-
-5. Configure environment:
-```bash
-cp .env.example .env
-# Edit .env with your settings
-```
-
-6. Start the application:
-```bash
-npm run dev
-```
-
-7. Open http://localhost:3000 in your browser
-
-## Troubleshooting
-
-If Ollama fails to start:
-```bash
-# Stop any existing instance
-brew services stop ollama
-# Wait a few seconds
-sleep 5
-# Start again
-brew services start ollama
-```
-
-To verify Ollama is running:
-```bash
-curl http://localhost:11434/api/version
-```
-
-## Features
-
-- Business search with location filtering
-- Contact information extraction
-- AI-powered data validation
-- Clean, user-friendly interface
-- Service health monitoring
+# 🚀 Perplexica - An AI-powered search engine 🔎 <!-- omit in toc -->
+
+[](https://discord.gg/26aArMy8tT)
+
+## Table of Contents <!-- omit in toc -->
+
+- [Overview](#overview)
+- [Preview](#preview)
+- [Features](#features)
+- [Installation](#installation)
+  - [Getting Started with Docker (Recommended)](#getting-started-with-docker-recommended)
+  - [Non-Docker Installation](#non-docker-installation)
+  - [Ollama Connection Errors](#ollama-connection-errors)
+- [Using as a Search Engine](#using-as-a-search-engine)
+- [Using Perplexica's API](#using-perplexicas-api)
+- [Expose Perplexica to a network](#expose-perplexica-to-network)
+- [One-Click Deployment](#one-click-deployment)
+- [Upcoming Features](#upcoming-features)
+- [Support Us](#support-us)
+  - [Donations](#donations)
+- [Contribution](#contribution)
+- [Help and Support](#help-and-support)
+
+## Overview
+
+Perplexica is an open-source AI-powered search engine that goes deep into the internet to find answers. Inspired by Perplexity AI, it's an open-source option that not only searches the web but also understands your questions. It uses advanced machine learning techniques like similarity searching and embeddings to refine results, and it provides clear answers with sources cited.
+
+Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.
+
+Want to know more about its architecture and how it works? You can read about it [here](https://github.com/ItzCrazyKns/Perplexica/tree/master/docs/architecture/README.md).
+
+## Preview
+
+## Features
+
+- **Local LLMs**: You can make use of local LLMs such as Llama3 and Mixtral using Ollama.
+- **Two Main Modes:**
+  - **Copilot Mode:** (In development) Boosts search by generating different queries to find more relevant internet sources. Instead of just using the context from SearxNG, it visits the top matches and tries to find sources relevant to the user's query directly from the page.
+  - **Normal Mode:** Processes your query and performs a web search.
+- **Focus Modes:** Special modes to better answer specific types of questions. Perplexica currently has 6 focus modes:
+  - **All Mode:** Searches the entire web to find the best results.
+  - **Writing Assistant Mode:** Helpful for writing tasks that do not require searching the web.
+  - **Academic Search Mode:** Finds articles and papers, ideal for academic research.
+  - **YouTube Search Mode:** Finds YouTube videos based on the search query.
+  - **Wolfram Alpha Search Mode:** Answers queries that need calculations or data analysis using Wolfram Alpha.
+  - **Reddit Search Mode:** Searches Reddit for discussions and opinions related to the query.
+- **Current Information:** Some search tools may give you outdated information because they rely on data from crawling bots, converted into embeddings and stored in an index. Unlike them, Perplexica uses SearxNG, a metasearch engine, to fetch results and then reranks them to surface the most relevant sources, ensuring you always get the latest information without the overhead of daily data updates.
+- **API**: Integrate Perplexica into your existing applications and make use of its capabilities.
````
````diff
-## Configuration
-
-Key environment variables:
-- `SEARXNG_URL`: Your SearxNG instance URL
-- `OLLAMA_URL`: Ollama API endpoint (default: http://localhost:11434)
-- `SUPABASE_URL`: Your Supabase project URL
-- `SUPABASE_ANON_KEY`: Your Supabase anonymous key
-- `CACHE_DURATION_DAYS`: How long to cache results (default: 7)
-
-## Supabase Setup
-
-1. Create a new Supabase project
-2. Run the SQL commands in `db/init.sql` to create the cache table
-3. Copy your project URL and anon key to `.env`
-
-## License
-
-MIT
-
-## Cache Management
-
-The application uses Supabase for caching search results. Cache entries expire after 7 days.
-
-### Manual Cache Cleanup
-
-If automatic cleanup is not available, you can manually clean up expired entries:
-
-1. Using the API:
-```bash
-curl -X POST http://localhost:3000/api/cleanup
-```
-
-2. Using SQL:
-```sql
-select manual_cleanup();
-```
-
-### Cache Statistics
-
-View cache statistics using:
-```sql
-select * from cache_stats;
-```
+It has many more features like image and video search. Some of the planned features are mentioned in [upcoming features](#upcoming-features).
+
+## Installation
+
+There are mainly 2 ways of installing Perplexica - with Docker, or without Docker. Using Docker is highly recommended.
+
+### Getting Started with Docker (Recommended)
+
+1. Ensure Docker is installed and running on your system.
+2. Clone the Perplexica repository:
+
+   ```bash
+   git clone https://github.com/ItzCrazyKns/Perplexica.git
+   ```
+
+3. After cloning, navigate to the directory containing the project files.
+
+4. Rename the `sample.config.toml` file to `config.toml`. For Docker setups, you need only fill in the following fields:
+
+   - `OPENAI`: Your OpenAI API key. **You only need to fill this if you wish to use OpenAI's models**.
+   - `OLLAMA`: Your Ollama API URL. You should enter it as `http://host.docker.internal:PORT_NUMBER`. If you installed Ollama on port 11434, use `http://host.docker.internal:11434`. For other ports, adjust accordingly. **You need to fill this if you wish to use Ollama's models instead of OpenAI's**.
+   - `GROQ`: Your Groq API key. **You only need to fill this if you wish to use Groq's hosted models**.
+   - `ANTHROPIC`: Your Anthropic API key. **You only need to fill this if you wish to use Anthropic models**.
+
+   **Note**: You can change these after starting Perplexica from the settings dialog.
+
+   - `SIMILARITY_MEASURE`: The similarity measure to use (this is filled by default; you can leave it as is if you are unsure about it).
+
+5. Ensure you are in the directory containing the `docker-compose.yaml` file and execute:
+
+   ```bash
+   docker compose up -d
+   ```
+
+6. Wait a few minutes for the setup to complete. You can access Perplexica at http://localhost:3000 in your web browser.
+
+**Note**: After the containers are built, you can start Perplexica directly from Docker without having to open a terminal.
+
+### Non-Docker Installation
+
+1. Install SearXNG and allow `JSON` format in the SearXNG settings.
+2. Clone the repository and rename the `sample.config.toml` file to `config.toml` in the root directory. Ensure you complete all required fields in this file.
+3. Rename the `.env.example` file to `.env` in the `ui` folder and fill in all necessary fields.
+4. After populating the configuration and environment files, run `npm i` in both the `ui` folder and the root directory.
+5. Install the dependencies and then execute `npm run build` in both the `ui` folder and the root directory.
+6. Finally, start both the frontend and the backend by running `npm run start` in both the `ui` folder and the root directory.
+
+**Note**: Using Docker is recommended as it simplifies the setup process, especially for managing environment variables and dependencies.
+
+See the [installation documentation](https://github.com/ItzCrazyKns/Perplexica/tree/master/docs/installation) for more information like exposing it to your network, etc.
+
+### Ollama Connection Errors
+
+If you're encountering an Ollama connection error, it is likely due to the backend being unable to connect to Ollama's API. To fix this issue you can:
+
+1. **Check your Ollama API URL:** Ensure that the API URL is correctly set in the settings menu.
+2. **Update API URL Based on OS:**
+
+   - **Windows:** Use `http://host.docker.internal:11434`
+   - **Mac:** Use `http://host.docker.internal:11434`
+   - **Linux:** Use `http://<private_ip_of_host>:11434`
+
+   Adjust the port number if you're using a different one.
+
+3. **Linux Users - Expose Ollama to Network:**
+
+   - Inside `/etc/systemd/system/ollama.service`, you need to add `Environment="OLLAMA_HOST=0.0.0.0"`. Then restart Ollama by `systemctl restart ollama`. For more information see [Ollama docs](https://github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-linux)
+   - Ensure that the port (default is 11434) is not blocked by your firewall.
````
````diff
+## Using as a Search Engine
+
+If you wish to use Perplexica as an alternative to traditional search engines like Google or Bing, or if you want to add a shortcut for quick access from your browser's search bar, follow these steps:
+
+1. Open your browser's settings.
+2. Navigate to the 'Search Engines' section.
+3. Add a new site search with the following URL: `http://localhost:3000/?q=%s`. Replace `localhost` with your IP address or domain name, and `3000` with the port number if Perplexica is not hosted locally.
+4. Click the add button. Now, you can use Perplexica directly from your browser's search bar.
+
+## Using Perplexica's API
+
+Perplexica also provides an API for developers looking to integrate its powerful search engine into their own applications. You can run searches, use multiple models and get answers to your queries.
+
+For more details, check out the full documentation [here](https://github.com/ItzCrazyKns/Perplexica/tree/master/docs/API/SEARCH.md).
+
+## Expose Perplexica to network
+
+You can access Perplexica over your home network by following our networking guide [here](https://github.com/ItzCrazyKns/Perplexica/blob/master/docs/installation/NETWORKING.md).
+
+## One-Click Deployment
+
+[](https://usw.sealos.io/?openapp=system-template%3FtemplateName%3Dperplexica)
+[](https://repocloud.io/details/?app_id=267)
+
+## Upcoming Features
+
+- [x] Add settings page
+- [x] Adding support for local LLMs
+- [x] History Saving features
+- [x] Introducing various Focus Modes
+- [x] Adding API support
+- [x] Adding Discover
+- [ ] Finalizing Copilot Mode
+
+## Support Us
+
+If you find Perplexica useful, consider giving us a star on GitHub. This helps more people discover Perplexica and supports the development of new features. Your support is greatly appreciated.
+
+### Donations
+
+We also accept donations to help sustain our project. If you would like to contribute, you can use the following options to donate. Thank you for your support!
+
+| Ethereum                                              |
+| ----------------------------------------------------- |
+| Address: `0xB025a84b2F269570Eb8D4b05DEdaA41D8525B6DD` |
+
+## Contribution
+
+Perplexica is built on the idea that AI and large language models should be easy for everyone to use. If you find bugs or have ideas, please share them via GitHub Issues. For more information on contributing to Perplexica, you can read the [CONTRIBUTING.md](CONTRIBUTING.md) file to learn more about Perplexica and how you can contribute to it.
+
+## Help and Support
+
+If you have any questions or feedback, please feel free to reach out to us. You can create an issue on GitHub or join our Discord server. There, you can connect with other users, share your experiences and reviews, and receive more personalized help. [Click here](https://discord.gg/EFwsmQDgAu) to join the Discord server. To discuss matters outside of regular support, feel free to contact me on Discord at `itzcrazykns`.
+
+Thank you for exploring Perplexica, the AI-powered search engine designed to enhance your search experience. We are constantly working to improve Perplexica and expand its capabilities. We value your feedback and contributions, which help us make Perplexica even better. Don't forget to check back for updates and new features!
````
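The "Using as a Search Engine" steps register the URL template `http://localhost:3000/?q=%s` with the browser. As a sketch, this is how a browser fills such a template (the `expandSearchTemplate` function is illustrative, not part of Perplexica):

```typescript
// Sketch: expand a browser search-engine URL template, substituting the
// percent-encoded query string for the %s placeholder.
function expandSearchTemplate(template: string, query: string): string {
  return template.replace('%s', encodeURIComponent(query));
}

// Example: typing a query in the search bar produces a request like this.
const url = expandSearchTemplate('http://localhost:3000/?q=%s', 'open source search');
```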
`config.toml` (14 lines deleted):

```toml
[GENERAL]
PORT = 3001 # Port to run the server on
SIMILARITY_MEASURE = "cosine" # "cosine" or "dot"
KEEP_ALIVE = "5m" # How long to keep Ollama models loaded into memory. (Instead of using -1 use "-1m")

[API_KEYS]
OPENAI = "" # OpenAI API key - sk-1234567890abcdef1234567890abcdef
GROQ = "" # Groq API key - gsk_1234567890abcdef1234567890abcdef
ANTHROPIC = "" # Anthropic API key - sk-ant-1234567890abcdef1234567890abcdef
GEMINI = "" # Gemini API key - sk-1234567890abcdef1234567890abcdef

[API_ENDPOINTS]
SEARXNG = "http://localhost:32768" # SearxNG API URL
OLLAMA = "" # Ollama API URL - http://host.docker.internal:11434
```
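`SIMILARITY_MEASURE` in the config above is either `"cosine"` or `"dot"`. As a sketch of what those two measures compute over embedding vectors (illustrative only; Perplexica's actual reranking code may differ):

```typescript
// Sketch: the two similarity measures named by SIMILARITY_MEASURE.
function dot(a: number[], b: number[]): number {
  // Sum of element-wise products.
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosine(a: number[], b: number[]): number {
  // Dot product normalized by the vectors' magnitudes; 1 means same direction.
  const norm = (v: number[]) => Math.sqrt(dot(v, v));
  return dot(a, b) / (norm(a) * norm(b));
}

function similarity(measure: 'cosine' | 'dot', a: number[], b: number[]): number {
  return measure === 'cosine' ? cosine(a, b) : dot(a, b);
}
```

Cosine similarity ignores vector magnitude, which matters when embeddings are not normalized; with unit-length embeddings the two measures rank results identically.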
`db/init.sql` (189 lines deleted):

```sql
-- Enable required extensions
create extension if not exists "uuid-ossp"; -- For UUID generation
create extension if not exists pg_cron; -- For scheduled jobs

-- Create the search_cache table
create table public.search_cache (
  id uuid default uuid_generate_v4() primary key,
  query text not null,
  results jsonb not null,
  location text not null,
  category text not null,
  created_at timestamp with time zone default timezone('utc'::text, now()) not null,
  updated_at timestamp with time zone default timezone('utc'::text, now()) not null,
  expires_at timestamp with time zone default timezone('utc'::text, now() + interval '7 days') not null
);

-- Create indexes
create index search_cache_query_idx on public.search_cache (query);
create index search_cache_location_category_idx on public.search_cache (location, category);
create index search_cache_expires_at_idx on public.search_cache (expires_at);

-- Enable RLS
alter table public.search_cache enable row level security;

-- Create policies
create policy "Allow public read access"
  on public.search_cache for select
  using (true);

create policy "Allow service write access"
  on public.search_cache for insert
  with check (true);

create policy "Allow service update access"
  on public.search_cache for update
  using (true)
  with check (true);

create policy "Allow delete expired records"
  on public.search_cache for delete
  using (expires_at < now());

-- Create function to clean up expired records
create or replace function cleanup_expired_cache()
returns void
language plpgsql
security definer
as $$
begin
  delete from public.search_cache
  where expires_at < now();
end;
$$;

-- Create a manual cleanup function since pg_cron might not be available
create or replace function manual_cleanup()
returns void
language plpgsql
security definer
as $$
begin
  delete from public.search_cache
  where expires_at < now();
end;
$$;

-- Create a view for cache statistics
create or replace view cache_stats as
select
  count(*) as total_entries,
  count(*) filter (where expires_at < now()) as expired_entries,
  count(*) filter (where expires_at >= now()) as valid_entries,
  min(created_at) as oldest_entry,
  max(created_at) as newest_entry,
  count(distinct category) as unique_categories,
  count(distinct location) as unique_locations
from public.search_cache;

-- Grant permissions to access the view
grant select on cache_stats to postgres;

-- Create table if not exists businesses
create table if not exists businesses (
  id text primary key,
  name text not null,
  phone text,
  email text,
  address text,
  rating numeric,
  website text,
  logo text,
  source text,
  description text,
  latitude numeric,
  longitude numeric,
  last_updated timestamp with time zone default timezone('utc'::text, now()),
  search_count integer default 1,
  created_at timestamp with time zone default timezone('utc'::text, now())
);

-- Create indexes for common queries
create index if not exists businesses_name_idx on businesses (name);
create index if not exists businesses_rating_idx on businesses (rating desc);
create index if not exists businesses_search_count_idx on businesses (search_count desc);
create index if not exists businesses_last_updated_idx on businesses (last_updated desc);

-- Create tables if they don't exist
CREATE TABLE IF NOT EXISTS businesses (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  phone TEXT,
  email TEXT,
  address TEXT,
  rating INTEGER,
  website TEXT,
  logo TEXT,
  source TEXT,
  description TEXT,
  location JSONB,
  place_id TEXT,
  photos TEXT[],
  opening_hours TEXT[],
  distance JSONB,
  last_updated TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  search_count INTEGER DEFAULT 0
);

CREATE TABLE IF NOT EXISTS searches (
  id SERIAL PRIMARY KEY,
  query TEXT NOT NULL,
  location TEXT NOT NULL,
  timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  results_count INTEGER
);

CREATE TABLE IF NOT EXISTS cache (
  key TEXT PRIMARY KEY,
  value JSONB NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  expires_at TIMESTAMP WITH TIME ZONE NOT NULL
);

-- Create indexes
CREATE INDEX IF NOT EXISTS idx_businesses_location ON businesses USING GIN (location);
CREATE INDEX IF NOT EXISTS idx_businesses_search ON businesses USING GIN (to_tsvector('english', name || ' ' || COALESCE(description, '')));
CREATE INDEX IF NOT EXISTS idx_cache_expires ON cache (expires_at);

-- Set up RLS (Row Level Security)
ALTER TABLE businesses ENABLE ROW LEVEL SECURITY;
ALTER TABLE searches ENABLE ROW LEVEL SECURITY;
ALTER TABLE cache ENABLE ROW LEVEL SECURITY;

-- Create policies
CREATE POLICY "Allow anonymous select" ON businesses FOR SELECT USING (true);
CREATE POLICY "Allow service role insert" ON businesses FOR INSERT WITH CHECK (true);
CREATE POLICY "Allow service role update" ON businesses FOR UPDATE USING (true);

CREATE POLICY "Allow anonymous select" ON searches FOR SELECT USING (true);
CREATE POLICY "Allow service role insert" ON searches FOR INSERT WITH CHECK (true);

CREATE POLICY "Allow anonymous select" ON cache FOR SELECT USING (true);
CREATE POLICY "Allow service role all" ON cache USING (true);

-- Add place_id column to businesses table if it doesn't exist
ALTER TABLE businesses ADD COLUMN IF NOT EXISTS place_id TEXT;
CREATE INDEX IF NOT EXISTS idx_businesses_place_id ON businesses(place_id);

-- Create a unique constraint on place_id (excluding nulls)
CREATE UNIQUE INDEX IF NOT EXISTS idx_businesses_place_id_unique
ON businesses(place_id)
WHERE place_id IS NOT NULL;

CREATE TABLE IF NOT EXISTS businesses (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  address TEXT NOT NULL,
  phone TEXT NOT NULL,
  description TEXT NOT NULL,
  website TEXT,
  source TEXT NOT NULL,
  rating REAL,
  lat REAL,
  lng REAL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_businesses_source ON businesses(source);
CREATE INDEX IF NOT EXISTS idx_businesses_rating ON businesses(rating);
```
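The schema above stamps each `search_cache` row with `expires_at = now() + interval '7 days'`. An application-side sketch of the same expiry rule, useful for deciding client-side whether a cached row is still valid (the helper names are hypothetical):

```typescript
// Sketch: mirror the SQL default `now() + interval '7 days'` in application code.
const CACHE_TTL_DAYS = 7;

function expiresAt(createdAt: Date, ttlDays: number = CACHE_TTL_DAYS): Date {
  // Add the TTL in whole days, expressed in milliseconds.
  return new Date(createdAt.getTime() + ttlDays * 24 * 60 * 60 * 1000);
}

function isExpired(expiry: Date, now: Date = new Date()): boolean {
  // Matches the SQL predicate `expires_at < now()`.
  return expiry.getTime() < now.getTime();
}
```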
Also deleted (44 lines):

```sql
-- Create the businesses table
create table businesses (
  id uuid primary key,
  name text not null,
  phone text,
  address text,
  city text,
  state text,
  zip text,
  category text[],
  rating numeric,
  review_count integer,
  license text,
  services text[],
  hours jsonb,
  website text,
  email text,
  verified boolean default false,
  last_updated timestamp with time zone,
  search_query text,
  search_location text,
  search_timestamp timestamp with time zone,
  reliability_score integer,

  -- Create a composite index for deduplication
  constraint unique_business unique (phone, address)
);

-- Create indexes for common queries
create index idx_business_location on businesses (city, state);
create index idx_business_category on businesses using gin (category);
create index idx_search_query on businesses using gin (search_query gin_trgm_ops);
create index idx_search_location on businesses using gin (search_location gin_trgm_ops);
create index idx_reliability on businesses (reliability_score);

-- Enable full text search
alter table businesses add column search_vector tsvector
  generated always as (
    setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(search_query, '')), 'B') ||
    setweight(to_tsvector('english', coalesce(search_location, '')), 'C')
  ) stored;

create index idx_business_search on businesses using gin(search_vector);
```
Also deleted (15 lines):

```sql
-- Check if table exists
SELECT EXISTS (
  SELECT FROM information_schema.tables
  WHERE table_schema = 'public'
  AND table_name = 'businesses'
);

-- Check table structure
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'businesses';

-- Check row count
SELECT COUNT(*) as count FROM businesses;
```
`docker-compose.yaml` (port mappings quoted):

```diff
@@ -4,7 +4,7 @@ services:
     volumes:
       - ./searxng:/etc/searxng:rw
     ports:
-      - 4000:8080
+      - '4000:8080'
     networks:
       - perplexica-network
     restart: unless-stopped
@@ -19,7 +19,7 @@ services:
     depends_on:
       - searxng
     ports:
-      - 3001:3001
+      - '3001:3001'
     volumes:
       - backend-dbstore:/home/perplexica/data
       - uploads:/home/perplexica/uploads
@@ -41,7 +41,7 @@ services:
    depends_on:
       - perplexica-backend
     ports:
-      - 3000:3000
+      - '3000:3000'
     networks:
       - perplexica-network
     restart: unless-stopped
```
Also deleted (26 lines):

```yaml
version: '3'
services:
  searxng:
    image: searxng/searxng
    ports:
      - "4000:8080"
    volumes:
      - ./searxng:/etc/searxng
    environment:
      - INSTANCE_NAME=perplexica-searxng
      - BASE_URL=http://localhost:4000/
      - SEARXNG_SECRET=your_secret_key_here
    restart: unless-stopped

  app:
    build:
      context: .
      dockerfile: backend.dockerfile
    ports:
      - "3000:3000"
    environment:
      - SEARXNG_URL=http://searxng:8080
    volumes:
      - ./config.toml:/home/perplexica/config.toml
    depends_on:
      - searxng
```
@@ -1,108 +0,0 @@
-# Ethical Web Scraping Guidelines
-
-## Core Principles
-
-1. **Respect Robots.txt**
-   - Always check and honor robots.txt directives
-   - Cache robots.txt to reduce server load
-   - Default to conservative behavior when uncertain
-
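Although this guidelines file is removed in this change, the robots.txt principle it states is mechanical to apply. A simplified, stdlib-only sketch (the helper name is illustrative, not code from this repository; it only handles `User-agent: *` groups and `Disallow` prefixes):

```typescript
// Returns true if `path` is allowed by the given robots.txt text.
// Simplified: honors only `User-agent: *` groups and prefix-style `Disallow`.
function isPathAllowed(robotsTxt: string, path: string): boolean {
  let inStarGroup = false;
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.split('#')[0].trim(); // strip comments and whitespace
    if (!line) continue;
    const [field, ...rest] = line.split(':');
    const value = rest.join(':').trim();
    if (field.trim().toLowerCase() === 'user-agent') {
      inStarGroup = value === '*';
    } else if (inStarGroup && field.trim().toLowerCase() === 'disallow') {
      // An empty Disallow value allows everything, so only block on a prefix match.
      if (value && path.startsWith(value)) return false;
    }
  }
  return true;
}
```

A real crawler would also cache the fetched robots.txt, matching the "reduce server load" bullet above.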
-2. **Proper Identification**
-   - Use clear, identifiable User-Agent strings
-   - Provide contact information
-   - Be transparent about your purpose
-
-3. **Rate Limiting**
-   - Implement conservative rate limits
-   - Use exponential backoff for errors
-   - Distribute requests over time
-
-4. **Data Usage**
-   - Only collect publicly available business information
-   - Respect privacy and data protection laws
-   - Provide clear opt-out mechanisms
-   - Keep data accurate and up-to-date
-
-5. **Technical Considerations**
-   - Cache results to minimize requests
-   - Handle errors gracefully
-   - Monitor and log access patterns
-   - Use structured data when available
-
-## Implementation
-
-1. **Request Headers**
-   ```typescript
-   const headers = {
-     'User-Agent': 'BizSearch/1.0 (+https://bizsearch.com/about)',
-     'Accept': 'text/html,application/xhtml+xml',
-     'From': 'contact@bizsearch.com'
-   };
-   ```
-
-2. **Rate Limiting**
-   ```typescript
-   const rateLimits = {
-     requestsPerMinute: 10,
-     requestsPerHour: 100,
-     requestsPerDomain: 20
-   };
-   ```
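A budget like the removed `rateLimits` object can be enforced with a sliding-window counter. The following is a sketch under my own naming, not the project's implementation:

```typescript
// Sliding-window rate limiter: allows at most `limit` requests per `windowMs`.
class RateLimiter {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  // Returns true and records the request if the budget allows it.
  tryAcquire(now: number = Date.now()): boolean {
    // Drop timestamps that have fallen out of the window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

A per-domain budget would simply keep one `RateLimiter` instance per hostname in a map.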
-
-3. **Caching**
-   ```typescript
-   const cacheSettings = {
-     ttl: 24 * 60 * 60, // 24 hours
-     maxSize: 1000 // entries
-   };
-   ```
-
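A cache honoring a `ttl` and `maxSize` like the removed settings could look like this stdlib-only sketch (hypothetical helper, not the repo's code):

```typescript
// Tiny TTL cache: entries expire after ttlMs; the oldest entry is evicted at maxSize.
class TtlCache<V> {
  private store = new Map<string, { value: V; expires: number }>();
  constructor(private ttlMs: number, private maxSize: number) {}

  set(key: string, value: V, now: number = Date.now()): void {
    if (this.store.size >= this.maxSize && !this.store.has(key)) {
      // Map preserves insertion order, so the first key is the oldest entry.
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(key, { value, expires: now + this.ttlMs });
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now > entry.expires) {
      this.store.delete(key); // lazily drop expired entries
      return undefined;
    }
    return entry.value;
  }
}
```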
-## Opt-Out Process
-
-1. Business owners can opt-out by:
-   - Submitting a form on our website
-   - Emailing opt-out@bizsearch.com
-   - Adding a meta tag: `<meta name="bizsearch" content="noindex">`
-
-2. We honor opt-outs within:
-   - 24 hours for direct requests
-   - 72 hours for cached data
-
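Detecting the opt-out meta tag described above is a one-regex check; an illustrative sketch (not the project's actual implementation):

```typescript
// Returns true if the page opts out via <meta name="bizsearch" content="noindex">.
function hasOptOutMeta(html: string): boolean {
  // Find a meta tag whose name attribute is "bizsearch", in any attribute order.
  const metaTag = /<meta\s[^>]*name=["']bizsearch["'][^>]*>/i.exec(html);
  if (!metaTag) return false;
  // Then check whether its content attribute includes the "noindex" directive.
  return /content=["'][^"']*noindex[^"']*["']/i.test(metaTag[0]);
}
```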
-## Legal Compliance
-
-1. **Data Protection**
-   - GDPR compliance for EU businesses
-   - CCPA compliance for California businesses
-   - Regular data audits and cleanup
-
-2. **Attribution**
-   - Clear source attribution
-   - Last-updated timestamps
-   - Data accuracy disclaimers
-
-## Best Practices
-
-1. **Before Scraping**
-   - Check robots.txt
-   - Verify site status
-   - Review terms of service
-   - Look for API alternatives
-
-2. **During Scraping**
-   - Monitor response codes
-   - Respect server hints
-   - Implement backoff strategies
-   - Log access patterns
-
-3. **After Scraping**
-   - Verify data accuracy
-   - Update cache entries
-   - Clean up old data
-   - Monitor opt-out requests
-
-## Contact
-
-For questions or concerns about our scraping practices:
-- Email: ethics@bizsearch.com
-- Phone: (555) 123-4567
-- Web: https://bizsearch.com/ethics
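The "implement backoff strategies" practice in the removed guidelines is usually realized as exponential backoff with a cap. A deterministic sketch (names are my own; production code would typically add random jitter):

```typescript
// Computes the retry delays for exponential backoff: base * 2^attempt, capped.
function backoffDelays(retries: number, baseMs: number, maxDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, maxDelayMs));
  }
  return delays;
}
```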
@@ -7,34 +7,43 @@ To update Perplexica to the latest version, follow these steps:
 1. Clone the latest version of Perplexica from GitHub:
 
    ```bash
    git clone https://github.com/ItzCrazyKns/Perplexica.git
    ```
 
-2. Navigate to the Project Directory.
+2. Navigate to the project directory.
 
-3. Pull latest images from registry.
+3. Check for changes in the configuration files. If the `sample.config.toml` file contains new fields, delete your existing `config.toml` file, rename `sample.config.toml` to `config.toml`, and update the configuration accordingly.
+
+4. Pull the latest images from the registry.
 
    ```bash
    docker compose pull
    ```
 
-4. Update and Recreate containers.
+5. Update and recreate the containers.
 
    ```bash
    docker compose up -d
    ```
 
-5. Once the command completes running go to http://localhost:3000 and verify the latest changes.
+6. Once the command completes, go to http://localhost:3000 and verify the latest changes.
 
-## For non Docker users
+## For non-Docker users
 
 1. Clone the latest version of Perplexica from GitHub:
 
    ```bash
    git clone https://github.com/ItzCrazyKns/Perplexica.git
    ```
 
-2. Navigate to the Project Directory
-3. Execute `npm i` in both the `ui` folder and the root directory.
-4. Once packages are updated, execute `npm run build` in both the `ui` folder and the root directory.
-5. Finally, start both the frontend and the backend by running `npm run start` in both the `ui` folder and the root directory.
+2. Navigate to the project directory.
+3. Check for changes in the configuration files. If the `sample.config.toml` file contains new fields, delete your existing `config.toml` file, rename `sample.config.toml` to `config.toml`, and update the configuration accordingly.
+4. Execute `npm i` in both the `ui` folder and the root directory.
+5. Once the packages are updated, execute `npm run build` in both the `ui` folder and the root directory.
+6. Finally, start both the frontend and the backend by running `npm run start` in both the `ui` folder and the root directory.
+
+---
41 frontend/.gitignore vendored
@@ -1,41 +0,0 @@
-# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
-
-# dependencies
-/node_modules
-/.pnp
-.pnp.*
-.yarn/*
-!.yarn/patches
-!.yarn/plugins
-!.yarn/releases
-!.yarn/versions
-
-# testing
-/coverage
-
-# next.js
-/.next/
-/out/
-
-# production
-/build
-
-# misc
-.DS_Store
-*.pem
-
-# debug
-npm-debug.log*
-yarn-debug.log*
-yarn-error.log*
-.pnpm-debug.log*
-
-# env files (can opt-in for committing if needed)
-.env*
-
-# vercel
-.vercel
-
-# typescript
-*.tsbuildinfo
-next-env.d.ts
@@ -1,36 +0,0 @@
-This is a [Next.js](https://nextjs.org) project bootstrapped with [`create-next-app`](https://nextjs.org/docs/app/api-reference/cli/create-next-app).
-
-## Getting Started
-
-First, run the development server:
-
-```bash
-npm run dev
-# or
-yarn dev
-# or
-pnpm dev
-# or
-bun dev
-```
-
-Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.
-
-You can start editing the page by modifying `app/page.tsx`. The page auto-updates as you edit the file.
-
-This project uses [`next/font`](https://nextjs.org/docs/app/building-your-application/optimizing/fonts) to automatically optimize and load [Geist](https://vercel.com/font), a new font family for Vercel.
-
-## Learn More
-
-To learn more about Next.js, take a look at the following resources:
-
-- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
-- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
-
-You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js) - your feedback and contributions are welcome!
-
-## Deploy on Vercel
-
-The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
-
-Check out our [Next.js deployment documentation](https://nextjs.org/docs/app/building-your-application/deploying) for more details.
@@ -1,16 +0,0 @@
-import { dirname } from "path";
-import { fileURLToPath } from "url";
-import { FlatCompat } from "@eslint/eslintrc";
-
-const __filename = fileURLToPath(import.meta.url);
-const __dirname = dirname(__filename);
-
-const compat = new FlatCompat({
-  baseDirectory: __dirname,
-});
-
-const eslintConfig = [
-  ...compat.extends("next/core-web-vitals", "next/typescript"),
-];
-
-export default eslintConfig;
@@ -1,13 +0,0 @@
-/** @type {import('next').NextConfig} */
-const nextConfig = {
-  async rewrites() {
-    return [
-      {
-        source: '/api/:path*',
-        destination: 'http://localhost:3000/api/:path*',
-      },
-    ]
-  }
-}
-
-module.exports = nextConfig
@@ -1,7 +0,0 @@
-import type { NextConfig } from "next";
-
-const nextConfig: NextConfig = {
-  /* config options here */
-};
-
-export default nextConfig;
5848 frontend/package-lock.json generated
File diff suppressed because it is too large
@@ -1,33 +0,0 @@
-{
-  "name": "frontend",
-  "version": "0.1.0",
-  "private": true,
-  "scripts": {
-    "dev": "next dev",
-    "build": "next build",
-    "start": "next start",
-    "lint": "next lint"
-  },
-  "dependencies": {
-    "@radix-ui/react-icons": "^1.3.2",
-    "class-variance-authority": "^0.7.1",
-    "clsx": "^2.1.1",
-    "lucide-react": "^0.469.0",
-    "next": "15.1.3",
-    "react": "^19.0.0",
-    "react-dom": "^19.0.0",
-    "tailwind-merge": "^2.6.0",
-    "tailwindcss-animate": "^1.0.7"
-  },
-  "devDependencies": {
-    "@eslint/eslintrc": "^3",
-    "@types/node": "^20",
-    "@types/react": "^19",
-    "@types/react-dom": "^19",
-    "eslint": "^9",
-    "eslint-config-next": "15.1.3",
-    "postcss": "^8",
-    "tailwindcss": "^3.4.1",
-    "typescript": "^5"
-  }
-}
@@ -1,8 +0,0 @@
-/** @type {import('postcss-load-config').Config} */
-const config = {
-  plugins: {
-    tailwindcss: {},
-  },
-};
-
-export default config;
@@ -1 +0,0 @@
-<svg fill="none" viewBox="0 0 16 16" xmlns="http://www.w3.org/2000/svg"><path d="M14.5 13.5V5.41a1 1 0 0 0-.3-.7L9.8.29A1 1 0 0 0 9.08 0H1.5v13.5A2.5 2.5 0 0 0 4 16h8a2.5 2.5 0 0 0 2.5-2.5m-1.5 0v-7H8v-5H3v12a1 1 0 0 0 1 1h8a1 1 0 0 0 1-1M9.5 5V2.12L12.38 5zM5.13 5h-.62v1.25h2.12V5zm-.62 3h7.12v1.25H4.5zm.62 3h-.62v1.25h7.12V11z" clip-rule="evenodd" fill="#666" fill-rule="evenodd"/></svg>
Before: 391 B
@@ -1 +0,0 @@
-<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><g clip-path="url(#a)"><path fill-rule="evenodd" clip-rule="evenodd" d="M10.27 14.1a6.5 6.5 0 0 0 3.67-3.45q-1.24.21-2.7.34-.31 1.83-.97 3.1M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16m.48-1.52a7 7 0 0 1-.96 0H7.5a4 4 0 0 1-.84-1.32q-.38-.89-.63-2.08a40 40 0 0 0 3.92 0q-.25 1.2-.63 2.08a4 4 0 0 1-.84 1.31zm2.94-4.76q1.66-.15 2.95-.43a7 7 0 0 0 0-2.58q-1.3-.27-2.95-.43a18 18 0 0 1 0 3.44m-1.27-3.54a17 17 0 0 1 0 3.64 39 39 0 0 1-4.3 0 17 17 0 0 1 0-3.64 39 39 0 0 1 4.3 0m1.1-1.17q1.45.13 2.69.34a6.5 6.5 0 0 0-3.67-3.44q.65 1.26.98 3.1M8.48 1.5l.01.02q.41.37.84 1.31.38.89.63 2.08a40 40 0 0 0-3.92 0q.25-1.2.63-2.08a4 4 0 0 1 .85-1.32 7 7 0 0 1 .96 0m-2.75.4a6.5 6.5 0 0 0-3.67 3.44 29 29 0 0 1 2.7-.34q.31-1.83.97-3.1M4.58 6.28q-1.66.16-2.95.43a7 7 0 0 0 0 2.58q1.3.27 2.95.43a18 18 0 0 1 0-3.44m.17 4.71q-1.45-.12-2.69-.34a6.5 6.5 0 0 0 3.67 3.44q-.65-1.27-.98-3.1" fill="#666"/></g><defs><clipPath id="a"><path fill="#fff" d="M0 0h16v16H0z"/></clipPath></defs></svg>
Before: 1.0 KiB
@@ -1 +0,0 @@
-<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 394 80"><path fill="#000" d="M262 0h68.5v12.7h-27.2v66.6h-13.6V12.7H262V0ZM149 0v12.7H94v20.4h44.3v12.6H94v21h55v12.6H80.5V0h68.7zm34.3 0h-17.8l63.8 79.4h17.9l-32-39.7 32-39.6h-17.9l-23 28.6-23-28.6zm18.3 56.7-9-11-27.1 33.7h17.8l18.3-22.7z"/><path fill="#000" d="M81 79.3 17 0H0v79.3h13.6V17l50.2 62.3H81Zm252.6-.4c-1 0-1.8-.4-2.5-1s-1.1-1.6-1.1-2.6.3-1.8 1-2.5 1.6-1 2.6-1 1.8.3 2.5 1a3.4 3.4 0 0 1 .6 4.3 3.7 3.7 0 0 1-3 1.8zm23.2-33.5h6v23.3c0 2.1-.4 4-1.3 5.5a9.1 9.1 0 0 1-3.8 3.5c-1.6.8-3.5 1.3-5.7 1.3-2 0-3.7-.4-5.3-1s-2.8-1.8-3.7-3.2c-.9-1.3-1.4-3-1.4-5h6c.1.8.3 1.6.7 2.2s1 1.2 1.6 1.5c.7.4 1.5.5 2.4.5 1 0 1.8-.2 2.4-.6a4 4 0 0 0 1.6-1.8c.3-.8.5-1.8.5-3V45.5zm30.9 9.1a4.4 4.4 0 0 0-2-3.3 7.5 7.5 0 0 0-4.3-1.1c-1.3 0-2.4.2-3.3.5-.9.4-1.6 1-2 1.6a3.5 3.5 0 0 0-.3 4c.3.5.7.9 1.3 1.2l1.8 1 2 .5 3.2.8c1.3.3 2.5.7 3.7 1.2a13 13 0 0 1 3.2 1.8 8.1 8.1 0 0 1 3 6.5c0 2-.5 3.7-1.5 5.1a10 10 0 0 1-4.4 3.5c-1.8.8-4.1 1.2-6.8 1.2-2.6 0-4.9-.4-6.8-1.2-2-.8-3.4-2-4.5-3.5a10 10 0 0 1-1.7-5.6h6a5 5 0 0 0 3.5 4.6c1 .4 2.2.6 3.4.6 1.3 0 2.5-.2 3.5-.6 1-.4 1.8-1 2.4-1.7a4 4 0 0 0 .8-2.4c0-.9-.2-1.6-.7-2.2a11 11 0 0 0-2.1-1.4l-3.2-1-3.8-1c-2.8-.7-5-1.7-6.6-3.2a7.2 7.2 0 0 1-2.4-5.7 8 8 0 0 1 1.7-5 10 10 0 0 1 4.3-3.5c2-.8 4-1.2 6.4-1.2 2.3 0 4.4.4 6.2 1.2 1.8.8 3.2 2 4.3 3.4 1 1.4 1.5 3 1.5 5h-5.8z"/></svg>
Before: 1.3 KiB
@@ -1 +0,0 @@
-<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1155 1000"><path d="m577.3 0 577.4 1000H0z" fill="#fff"/></svg>
Before: 128 B
@@ -1 +0,0 @@
-<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><path fill-rule="evenodd" clip-rule="evenodd" d="M1.5 2.5h13v10a1 1 0 0 1-1 1h-11a1 1 0 0 1-1-1zM0 1h16v11.5a2.5 2.5 0 0 1-2.5 2.5h-11A2.5 2.5 0 0 1 0 12.5zm3.75 4.5a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5M7 4.75a.75.75 0 1 1-1.5 0 .75.75 0 0 1 1.5 0m1.75.75a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5" fill="#666"/></svg>
Before: 385 B
Binary file not shown.
Before: 25 KiB
@@ -1,76 +0,0 @@
-@tailwind base;
-@tailwind components;
-@tailwind utilities;
-
-@layer base {
-  :root {
-    --background: 0 0% 100%;
-    --foreground: 222.2 84% 4.9%;
-
-    --card: 0 0% 100%;
-    --card-foreground: 222.2 84% 4.9%;
-
-    --popover: 0 0% 100%;
-    --popover-foreground: 222.2 84% 4.9%;
-
-    --primary: 222.2 47.4% 11.2%;
-    --primary-foreground: 210 40% 98%;
-
-    --secondary: 210 40% 96.1%;
-    --secondary-foreground: 222.2 47.4% 11.2%;
-
-    --muted: 210 40% 96.1%;
-    --muted-foreground: 215.4 16.3% 46.9%;
-
-    --accent: 210 40% 96.1%;
-    --accent-foreground: 222.2 47.4% 11.2%;
-
-    --destructive: 0 84.2% 60.2%;
-    --destructive-foreground: 210 40% 98%;
-
-    --border: 214.3 31.8% 91.4%;
-    --input: 214.3 31.8% 91.4%;
-    --ring: 222.2 84% 4.9%;
-
-    --radius: 0.5rem;
-  }
-
-  .dark {
-    --background: 222.2 84% 4.9%;
-    --foreground: 210 40% 98%;
-
-    --card: 222.2 84% 4.9%;
-    --card-foreground: 210 40% 98%;
-
-    --popover: 222.2 84% 4.9%;
-    --popover-foreground: 210 40% 98%;
-
-    --primary: 210 40% 98%;
-    --primary-foreground: 222.2 47.4% 11.2%;
-
-    --secondary: 217.2 32.6% 17.5%;
-    --secondary-foreground: 210 40% 98%;
-
-    --muted: 217.2 32.6% 17.5%;
-    --muted-foreground: 215 20.2% 65.1%;
-
-    --accent: 217.2 32.6% 17.5%;
-    --accent-foreground: 210 40% 98%;
-
-    --destructive: 0 62.8% 30.6%;
-    --destructive-foreground: 210 40% 98%;
-
-    --border: 217.2 32.6% 17.5%;
-    --input: 217.2 32.6% 17.5%;
-    --ring: 212.7 26.8% 83.9%;
-  }
-}
-
-@layer base {
-  * {
-    @apply border-border;
-  }
-  body {
-    @apply bg-background text-foreground;
-  }
-}
@@ -1,34 +0,0 @@
-import type { Metadata } from "next";
-import { Geist, Geist_Mono } from "next/font/google";
-import "./globals.css";
-
-const geistSans = Geist({
-  variable: "--font-geist-sans",
-  subsets: ["latin"],
-});
-
-const geistMono = Geist_Mono({
-  variable: "--font-geist-mono",
-  subsets: ["latin"],
-});
-
-export const metadata: Metadata = {
-  title: "Create Next App",
-  description: "Generated by create next app",
-};
-
-export default function RootLayout({
-  children,
-}: Readonly<{
-  children: React.ReactNode;
-}>) {
-  return (
-    <html lang="en">
-      <body
-        className={`${geistSans.variable} ${geistMono.variable} antialiased`}
-      >
-        {children}
-      </body>
-    </html>
-  );
-}
@@ -1,26 +0,0 @@
-'use client'
-
-import { ServerStatus } from "@/components/server-status"
-import { SearchForm } from "@/components/search-form"
-import { SearchResults } from "@/components/search-results"
-import { useState } from "react"
-
-export default function Home() {
-  const [searchResults, setSearchResults] = useState([])
-  const [isSearching, setIsSearching] = useState(false)
-
-  const services = [
-    { name: "Ollama", status: "running" as const },
-    { name: "SearxNG", status: "running" as const },
-    { name: "Supabase", status: "running" as const }
-  ]
-
-  return (
-    <main className="container mx-auto p-4">
-      <h1 className="text-4xl font-bold text-center mb-8">Business Search</h1>
-      <SearchForm onSearch={setSearchResults} onSearchingChange={setIsSearching} />
-      <SearchResults results={searchResults} isLoading={isSearching} />
-      <ServerStatus services={services} />
-    </main>
-  )
-}
@@ -1,79 +0,0 @@
-import { Search } from "lucide-react"
-import { useState } from "react"
-
-interface SearchFormProps {
-  onSearch: (results: any[]) => void;
-  onSearchingChange: (isSearching: boolean) => void;
-}
-
-export function SearchForm({ onSearch, onSearchingChange }: SearchFormProps) {
-  const [query, setQuery] = useState("")
-  const [error, setError] = useState<string | null>(null)
-
-  const handleSearch = async (e: React.FormEvent) => {
-    e.preventDefault()
-    if (!query.trim()) return
-
-    setError(null)
-    onSearchingChange(true)
-    try {
-      const response = await fetch("/api/search", {
-        method: "POST",
-        headers: {
-          "Content-Type": "application/json",
-        },
-        body: JSON.stringify({ query: query.trim() }),
-      })
-
-      if (!response.ok) {
-        throw new Error("Search failed")
-      }
-
-      const data = await response.json()
-      onSearch(data.results || [])
-    } catch (error) {
-      console.error("Search error:", error)
-      onSearch([])
-      setError("Failed to perform search. Please try again.")
-    } finally {
-      onSearchingChange(false)
-    }
-  }
-
-  return (
-    <div className="w-full max-w-2xl mx-auto mt-8 mb-12">
-      <div className="flex flex-col gap-4">
-        <div className="flex flex-col gap-2">
-          <label htmlFor="search" className="text-lg font-medium text-center">
-            Find local businesses
-          </label>
-          <form onSubmit={handleSearch} className="relative">
-            <input
-              id="search"
-              type="text"
-              value={query}
-              onChange={(e) => setQuery(e.target.value)}
-              placeholder="e.g. plumbers in Denver, CO"
-              className="w-full px-4 py-3 text-lg rounded-lg border border-border bg-background focus:outline-none focus:ring-2 focus:ring-primary"
-            />
-            <button
-              type="submit"
-              disabled={!query.trim()}
-              className="absolute right-2 top-1/2 -translate-y-1/2 p-3 rounded-md bg-primary text-primary-foreground hover:bg-primary/90 transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
-              aria-label="Search"
-            >
-              <Search className="h-5 w-5" />
-            </button>
-          </form>
-          {error && (
-            <p className="text-sm text-destructive text-center">{error}</p>
-          )}
-          <p className="text-sm text-muted-foreground text-center mt-2">
-            Try searching for: restaurants, dentists, electricians, etc.
-          </p>
-        </div>
-      </div>
-    </div>
-  )
-}
@@ -1,76 +0,0 @@
-interface Business {
-  id: string;
-  name: string;
-  address: string;
-  phone: string;
-  website?: string;
-  email?: string;
-  description?: string;
-  rating?: number;
-}
-
-interface SearchResultsProps {
-  results: Business[];
-  isLoading: boolean;
-}
-
-export function SearchResults({ results, isLoading }: SearchResultsProps) {
-  if (isLoading) {
-    return (
-      <div className="w-full max-w-4xl mx-auto mt-8">
-        <div className="animate-pulse space-y-4">
-          {[...Array(3)].map((_, i) => (
-            <div key={i} className="bg-muted rounded-lg p-6">
-              <div className="h-4 bg-muted-foreground/20 rounded w-3/4 mb-4"></div>
-              <div className="h-3 bg-muted-foreground/20 rounded w-1/2"></div>
-            </div>
-          ))}
-        </div>
-      </div>
-    )
-  }
-
-  if (!results.length) {
-    return null
-  }
-
-  return (
-    <div className="w-full max-w-4xl mx-auto mt-8">
-      <div className="space-y-4">
-        {results.map((business) => (
-          <div key={business.id} className="bg-card rounded-lg p-6 shadow-sm">
-            <h3 className="text-xl font-semibold mb-2">{business.name}</h3>
-            {business.address && (
-              <p className="text-muted-foreground mb-2">{business.address}</p>
-            )}
-            <div className="flex flex-wrap gap-4 text-sm">
-              {business.phone && (
-                <a
-                  href={`tel:${business.phone}`}
-                  className="text-primary hover:underline"
-                >
-                  {business.phone}
-                </a>
-              )}
-              {business.website && (
-                <a
-                  href={business.website}
-                  target="_blank"
-                  rel="noopener noreferrer"
-                  className="text-primary hover:underline"
-                >
-                  Visit Website
-                </a>
-              )}
-            </div>
-            {business.description && (
-              <p className="mt-4 text-sm text-muted-foreground">
-                {business.description}
-              </p>
-            )}
-          </div>
-        ))}
-      </div>
-    </div>
-  )
-}
@@ -1,59 +0,0 @@
-import { CheckCircle2, XCircle, AlertCircle } from "lucide-react"
-import { Alert, AlertDescription, AlertTitle } from "@/components/ui/alert"
-
-interface ServiceStatus {
-  name: string
-  status: "running" | "error" | "warning"
-}
-
-interface ServerStatusProps {
-  services: ServiceStatus[]
-  error?: string
-}
-
-export function ServerStatus({ services, error }: ServerStatusProps) {
-  if (error) {
-    return (
-      <Alert variant="destructive" className="max-w-md mx-auto mt-4">
-        <XCircle className="h-4 w-4" />
-        <AlertTitle>Server Error</AlertTitle>
-        <AlertDescription>{error}</AlertDescription>
-      </Alert>
-    )
-  }
-
-  return (
-    <div className="space-y-4 max-w-md mx-auto mt-4">
-      <h2 className="text-xl font-semibold text-center mb-6">Service Status</h2>
-      <div className="space-y-3">
-        {services.map((service) => (
-          <Alert
-            key={service.name}
-            variant={service.status === "error" ? "destructive" : "default"}
-            className="flex items-center justify-between hover:bg-accent/50 transition-colors"
-          >
-            <div className="flex items-center gap-3">
-              {service.status === "running" && (
-                <CheckCircle2 className="h-5 w-5 text-green-500 shrink-0" />
-              )}
-              {service.status === "error" && (
-                <XCircle className="h-5 w-5 text-red-500 shrink-0" />
-              )}
-              {service.status === "warning" && (
-                <AlertCircle className="h-5 w-5 text-yellow-500 shrink-0" />
-              )}
-              <AlertTitle className="font-medium">{service.name}</AlertTitle>
-            </div>
-            <span className={`text-sm ${
-              service.status === "running" ? "text-green-600" :
-              service.status === "error" ? "text-red-600" :
-              "text-yellow-600"
-            }`}>
-              {service.status.charAt(0).toUpperCase() + service.status.slice(1)}
-            </span>
-          </Alert>
-        ))}
-      </div>
-    </div>
-  )
-}
@@ -1,58 +0,0 @@
-import * as React from "react"
-import { cva, type VariantProps } from "class-variance-authority"
-import { cn } from "@/lib/utils"
-
-const alertVariants = cva(
-  "relative w-full rounded-lg border p-4 [&>svg~*]:pl-7 [&>svg+div]:translate-y-[-3px] [&>svg]:absolute [&>svg]:left-4 [&>svg]:top-4 [&>svg]:text-foreground",
-  {
-    variants: {
-      variant: {
-        default: "bg-background text-foreground",
-        destructive:
-          "border-destructive/50 text-destructive dark:border-destructive [&>svg]:text-destructive",
-      },
-    },
-    defaultVariants: {
-      variant: "default",
-    },
-  }
-)
-
-const Alert = React.forwardRef<
-  HTMLDivElement,
-  React.HTMLAttributes<HTMLDivElement> & VariantProps<typeof alertVariants>
->(({ className, variant, ...props }, ref) => (
-  <div
-    ref={ref}
-    role="alert"
-    className={cn(alertVariants({ variant }), className)}
-    {...props}
-  />
-))
-Alert.displayName = "Alert"
-
-const AlertTitle = React.forwardRef<
-  HTMLParagraphElement,
-  React.HTMLAttributes<HTMLHeadingElement>
->(({ className, ...props }, ref) => (
-  <h5
-    ref={ref}
-    className={cn("mb-1 font-medium leading-none tracking-tight", className)}
-    {...props}
-  />
-))
-AlertTitle.displayName = "AlertTitle"
-
-const AlertDescription = React.forwardRef<
-  HTMLParagraphElement,
-  React.HTMLAttributes<HTMLParagraphElement>
->(({ className, ...props }, ref) => (
-  <div
-    ref={ref}
-    className={cn("text-sm [&_p]:leading-relaxed", className)}
-    {...props}
-  />
-))
-AlertDescription.displayName = "AlertDescription"
-
-export { Alert, AlertTitle, AlertDescription }
@@ -1,6 +0,0 @@
-import { type ClassValue, clsx } from "clsx"
-import { twMerge } from "tailwind-merge"
-
-export function cn(...inputs: ClassValue[]) {
-  return twMerge(clsx(inputs))
-}
@@ -1,79 +0,0 @@
-import type { Config } from "tailwindcss";
-
-const config: Config = {
-  darkMode: ["class"],
-  content: [
-    './pages/**/*.{ts,tsx}',
-    './components/**/*.{ts,tsx}',
-    './app/**/*.{ts,tsx}',
-    './src/**/*.{ts,tsx}',
-  ],
-  theme: {
-    container: {
-      center: true,
-      padding: "2rem",
-      screens: {
-        "2xl": "1400px",
-      },
-    },
-    extend: {
-      colors: {
-        border: "hsl(var(--border))",
-        input: "hsl(var(--input))",
-        ring: "hsl(var(--ring))",
-        background: "hsl(var(--background))",
-        foreground: "hsl(var(--foreground))",
-        primary: {
-          DEFAULT: "hsl(var(--primary))",
-          foreground: "hsl(var(--primary-foreground))",
-        },
-        secondary: {
-          DEFAULT: "hsl(var(--secondary))",
-          foreground: "hsl(var(--secondary-foreground))",
-        },
-        destructive: {
-          DEFAULT: "hsl(var(--destructive))",
-          foreground: "hsl(var(--destructive-foreground))",
-        },
-        muted: {
-          DEFAULT: "hsl(var(--muted))",
-          foreground: "hsl(var(--muted-foreground))",
-        },
-        accent: {
-          DEFAULT: "hsl(var(--accent))",
-          foreground: "hsl(var(--accent-foreground))",
-        },
-        popover: {
-          DEFAULT: "hsl(var(--popover))",
-          foreground: "hsl(var(--popover-foreground))",
-        },
-        card: {
-          DEFAULT: "hsl(var(--card))",
-          foreground: "hsl(var(--card-foreground))",
-        },
-      },
-      borderRadius: {
-        lg: "var(--radius)",
-        md: "calc(var(--radius) - 2px)",
-        sm: "calc(var(--radius) - 4px)",
-      },
-      keyframes: {
-        "accordion-down": {
-          from: { height: "0" },
-          to: { height: "var(--radix-accordion-content-height)" },
-        },
-        "accordion-up": {
-          from: { height: "var(--radix-accordion-content-height)" },
-          to: { height: "0" },
-        },
-      },
-      animation: {
-        "accordion-down": "accordion-down 0.2s ease-out",
-        "accordion-up": "accordion-up 0.2s ease-out",
-      },
-    },
-  },
-  plugins: [require("tailwindcss-animate")],
-}
-
-export default config;
@@ -1,27 +0,0 @@
{
  "compilerOptions": {
    "target": "ES2017",
    "lib": ["dom", "dom.iterable", "esnext"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "preserve",
    "incremental": true,
    "plugins": [
      {
        "name": "next"
      }
    ],
    "paths": {
      "@/*": ["./src/*"]
    }
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
  "exclude": ["node_modules"]
}
@@ -1,17 +0,0 @@
module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'node',
  roots: ['<rootDir>/src'],
  testMatch: ['**/__tests__/**/*.ts', '**/?(*.)+(spec|test).ts'],
  transform: {
    '^.+\\.ts$': 'ts-jest',
  },
  moduleFileExtensions: ['ts', 'js', 'json', 'node'],
  collectCoverageFrom: [
    'src/**/*.{ts,js}',
    '!src/tests/**',
    '!**/node_modules/**',
  ],
  coverageDirectory: 'coverage',
  setupFilesAfterEnv: ['<rootDir>/src/tests/setup.ts'],
};
14318  package-lock.json (generated) - file diff suppressed because it is too large
41  package.json
@@ -1,80 +1,53 @@
 {
   "name": "perplexica-backend",
-  "version": "1.10.0-rc2",
+  "version": "1.10.0-rc3",
   "license": "MIT",
   "author": "ItzCrazyKns",
   "scripts": {
-    "start": "ts-node src/index.ts",
+    "start": "npm run db:push && node dist/app.js",
     "build": "tsc",
-    "dev": "nodemon src/index.ts",
+    "dev": "nodemon --ignore uploads/ src/app.ts",
     "db:push": "drizzle-kit push sqlite",
     "format": "prettier . --check",
-    "format:write": "prettier . --write",
-    "test:search": "ts-node src/tests/testSearch.ts",
-    "test:supabase": "ts-node src/tests/supabaseTest.ts",
-    "test:deepseek": "ts-node src/tests/testDeepseek.ts",
-    "test:ollama": "ts-node src/tests/testOllama.ts",
-    "test": "jest",
-    "test:watch": "jest --watch",
-    "test:coverage": "jest --coverage",
-    "build:css": "tailwindcss -i ./src/styles/input.css -o ./public/styles/output.css",
-    "watch:css": "tailwindcss -i ./src/styles/input.css -o ./public/styles/output.css --watch"
+    "format:write": "prettier . --write"
   },
   "devDependencies": {
-    "@testing-library/jest-dom": "^6.1.5",
     "@types/better-sqlite3": "^7.6.10",
     "@types/cors": "^2.8.17",
     "@types/express": "^4.17.21",
     "@types/html-to-text": "^9.0.4",
-    "@types/jest": "^29.5.11",
     "@types/multer": "^1.4.12",
-    "@types/node-fetch": "^2.6.12",
     "@types/pdf-parse": "^1.1.4",
     "@types/readable-stream": "^4.0.11",
-    "@types/supertest": "^6.0.2",
     "@types/ws": "^8.5.12",
-    "autoprefixer": "^10.4.20",
     "drizzle-kit": "^0.22.7",
-    "jest": "^29.7.0",
     "nodemon": "^3.1.0",
-    "postcss": "^8.4.49",
     "prettier": "^3.2.5",
-    "supertest": "^7.0.0",
-    "tailwindcss": "^3.4.17",
-    "ts-jest": "^29.1.1",
     "ts-node": "^10.9.2",
     "typescript": "^5.4.3"
   },
   "dependencies": {
-    "@huggingface/transformers": "latest",
     "@iarna/toml": "^2.2.5",
     "@langchain/anthropic": "^0.2.3",
     "@langchain/community": "^0.2.16",
     "@langchain/google-genai": "^0.0.23",
     "@langchain/openai": "^0.0.25",
-    "@shadcn/ui": "^0.0.4",
-    "@supabase/supabase-js": "^2.47.10",
     "@xenova/transformers": "^2.17.1",
     "axios": "^1.6.8",
-    "better-sqlite3": "^11.7.0",
-    "cheerio": "^1.0.0",
+    "better-sqlite3": "^11.0.0",
     "compute-cosine-similarity": "^1.1.0",
     "compute-dot": "^1.1.0",
     "cors": "^2.8.5",
-    "dotenv": "^16.4.7",
+    "dotenv": "^16.4.5",
     "drizzle-orm": "^0.31.2",
     "express": "^4.19.2",
     "html-to-text": "^9.0.5",
     "langchain": "^0.1.30",
     "mammoth": "^1.8.0",
     "multer": "^1.4.5-lts.1",
-    "node-fetch": "^2.7.0",
     "pdf-parse": "^1.1.1",
-    "robots-parser": "^3.0.1",
-    "tesseract.js": "^4.1.4",
-    "torch": "latest",
     "winston": "^3.13.0",
     "ws": "^8.17.1",
-    "zod": "^3.24.1"
+    "zod": "^3.22.4"
   }
 }
@@ -1,6 +0,0 @@
module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
}
@@ -1,214 +0,0 @@
<!DOCTYPE html>
<html lang="en" class="h-full bg-gray-50">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>OffMarket Pro - Business Search</title>
    <link href="/styles/output.css" rel="stylesheet">
    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">
</head>
<body class="min-h-full">
    <div class="bg-white">
        <!-- Navigation -->
        <nav class="bg-white shadow-sm">
            <div class="mx-auto max-w-7xl px-4 sm:px-6 lg:px-8">
                <div class="flex h-16 justify-between items-center">
                    <div class="flex-shrink-0 flex items-center">
                        <h1 class="text-xl font-bold text-gray-900">OffMarket Pro</h1>
                    </div>
                </div>
            </div>
        </nav>

        <!-- Main Content -->
        <main class="mx-auto max-w-7xl px-4 sm:px-6 lg:px-8 py-8">
            <!-- Search Form -->
            <div class="mb-8">
                <h2 class="text-2xl font-bold text-gray-900 mb-6">Find Off-Market Property Services</h2>
                <div class="grid grid-cols-1 gap-4 sm:grid-cols-2">
                    <div>
                        <label for="searchQuery" class="block text-sm font-medium text-gray-700">Service Type</label>
                        <input type="text" id="searchQuery" class="mt-1 block w-full rounded-md border-gray-300 shadow-sm focus:border-primary focus:ring-primary sm:text-sm" placeholder="e.g. plumber, electrician">
                    </div>
                    <div>
                        <label for="searchLocation" class="block text-sm font-medium text-gray-700">Location</label>
                        <input type="text" id="searchLocation" class="mt-1 block w-full rounded-md border-gray-300 shadow-sm focus:border-primary focus:ring-primary sm:text-sm" placeholder="e.g. Denver, CO">
                    </div>
                </div>
                <div class="mt-4">
                    <button onclick="performSearch()" class="inline-flex items-center px-4 py-2 border border-transparent text-sm font-medium rounded-md shadow-sm text-white bg-primary hover:bg-primary-hover focus:outline-none focus:ring-2 focus:ring-offset-2 focus:ring-primary">
                        Search
                    </button>
                </div>
            </div>

            <!-- Progress Indicator -->
            <div id="searchProgress" class="hidden mb-8">
                <div class="bg-white shadow sm:rounded-lg">
                    <div class="px-4 py-5 sm:p-6">
                        <h3 class="text-lg font-medium leading-6 text-gray-900">Search Progress</h3>
                        <div class="mt-4">
                            <div class="relative pt-1">
                                <div class="overflow-hidden h-2 mb-4 text-xs flex rounded bg-gray-200">
                                    <div id="progressBar" class="shadow-none flex flex-col text-center whitespace-nowrap text-white justify-center bg-primary transition-all duration-500" style="width: 0%"></div>
                                </div>
                                <div id="progressText" class="text-sm text-gray-600"></div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>

            <!-- Error Display -->
            <div id="errorDisplay" class="hidden mb-8">
                <div class="rounded-md bg-red-50 p-4">
                    <div class="flex">
                        <div class="flex-shrink-0">
                            <svg class="h-5 w-5 text-red-400" viewBox="0 0 20 20" fill="currentColor">
                                <path fill-rule="evenodd" d="M10 18a8 8 0 100-16 8 8 0 000 16zM8.707 7.293a1 1 0 00-1.414 1.414L8.586 10l-1.293 1.293a1 1 0 101.414 1.414L10 11.414l1.293 1.293a1 1 0 001.414-1.414L11.414 10l1.293-1.293a1 1 0 00-1.414-1.414L10 8.586 8.707 7.293z" clip-rule="evenodd"/>
                            </svg>
                        </div>
                        <div class="ml-3">
                            <h3 class="text-sm font-medium text-red-800">Error</h3>
                            <div class="mt-2 text-sm text-red-700">
                                <p id="errorMessage"></p>
                            </div>
                        </div>
                    </div>
                </div>
            </div>

            <!-- Results Table -->
            <div id="resultsContainer" class="hidden">
                <div class="bg-white shadow overflow-hidden sm:rounded-lg">
                    <div class="px-4 py-5 sm:px-6">
                        <h3 class="text-lg leading-6 font-medium text-gray-900">Search Results</h3>
                    </div>
                    <div class="border-t border-gray-200">
                        <div class="overflow-x-auto">
                            <table class="min-w-full divide-y divide-gray-200">
                                <thead class="bg-gray-50">
                                    <tr>
                                        <th scope="col" class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Business</th>
                                        <th scope="col" class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Contact</th>
                                        <th scope="col" class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Actions</th>
                                    </tr>
                                </thead>
                                <tbody id="resultsBody" class="bg-white divide-y divide-gray-200">
                                    <!-- Results will be inserted here -->
                                </tbody>
                            </table>
                        </div>
                    </div>
                </div>
            </div>
        </main>
    </div>

    <script>
        class SearchProgress {
            constructor() {
                this.progressBar = document.getElementById('progressBar');
                this.progressText = document.getElementById('progressText');
                this.container = document.getElementById('searchProgress');
            }

            show() {
                this.container.classList.remove('hidden');
                this.setProgress(0, 'Starting search...');
            }

            hide() {
                this.container.classList.add('hidden');
            }

            setProgress(percent, message) {
                this.progressBar.style.width = `${percent}%`;
                this.progressText.textContent = message;
            }

            showError(message) {
                this.setProgress(100, `Error: ${message}`);
                this.progressBar.classList.remove('bg-primary');
                this.progressBar.classList.add('bg-red-500');
            }
        }

        async function performSearch() {
            const query = document.getElementById('searchQuery').value;
            const location = document.getElementById('searchLocation').value;

            if (!query || !location) {
                showError('Please enter both search query and location');
                return;
            }

            const progress = new SearchProgress();
            progress.show();

            try {
                document.getElementById('errorDisplay').classList.add('hidden');
                document.getElementById('resultsContainer').classList.add('hidden');

                const response = await fetch('/api/search', {
                    method: 'POST',
                    headers: {
                        'Content-Type': 'application/json',
                    },
                    body: JSON.stringify({ query, location })
                });

                const data = await response.json();

                if (!data.success) {
                    throw new Error(data.error || 'Search failed');
                }

                displayResults(data.results);
                progress.hide();

            } catch (error) {
                console.error('Search error:', error);
                progress.showError(error.message);
                showError(error.message);
            }
        }

        function showError(message) {
            const errorDisplay = document.getElementById('errorDisplay');
            const errorMessage = document.getElementById('errorMessage');
            errorMessage.textContent = message;
            errorDisplay.classList.remove('hidden');
        }

        function displayResults(results) {
            const container = document.getElementById('resultsContainer');
            const tbody = document.getElementById('resultsBody');

            tbody.innerHTML = results.map(business => `
                <tr>
                    <td class="px-6 py-4">
                        <div class="text-sm font-medium text-gray-900">${business.name}</div>
                        <div class="text-sm text-gray-500">${business.description}</div>
                    </td>
                    <td class="px-6 py-4">
                        <div class="text-sm text-gray-900">${business.address}</div>
                        <div class="text-sm text-gray-500">${business.phone}</div>
                    </td>
                    <td class="px-6 py-4">
                        ${business.website ?
                            `<a href="${business.website}" target="_blank"
                                class="inline-flex items-center px-3 py-2 border border-transparent text-sm leading-4 font-medium rounded-md text-white bg-primary hover:bg-primary-hover focus:outline-none focus:ring-2 focus:ring-offset-2 focus:ring-primary">
                                Visit Website
                            </a>` :
                            '<span class="text-sm text-gray-500">No website available</span>'
                        }
                    </td>
                </tr>
            `).join('');

            container.classList.remove('hidden');
        }
    </script>
</body>
</html>
@@ -3,12 +3,43 @@ PORT = 3001 # Port to run the server on
 SIMILARITY_MEASURE = "cosine" # "cosine" or "dot"
 KEEP_ALIVE = "5m" # How long to keep Ollama models loaded into memory. (Instead of using -1 use "-1m")
 
-[API_KEYS]
-OPENAI = "" # OpenAI API key - sk-1234567890abcdef1234567890abcdef
-GROQ = "" # Groq API key - gsk_1234567890abcdef1234567890abcdef
-ANTHROPIC = "" # Anthropic API key - sk-ant-1234567890abcdef1234567890abcdef
-GEMINI = "" # Gemini API key - sk-1234567890abcdef1234567890abcdef
-
-[API_ENDPOINTS]
-SEARXNG = "http://localhost:32768" # SearxNG API URL
-OLLAMA = "" # Ollama API URL - http://host.docker.internal:11434
+[SEARCH_ENGINE_BACKENDS] # "google" | "searxng" | "bing" | "brave" | "yacy"
+SEARCH = "searxng"
+IMAGE = "searxng"
+VIDEO = "searxng"
+NEWS = "searxng"
+
+[MODELS.OPENAI]
+API_KEY = ""
+
+[MODELS.GROQ]
+API_KEY = ""
+
+[MODELS.ANTHROPIC]
+API_KEY = ""
+
+[MODELS.GEMINI]
+API_KEY = ""
+
+[MODELS.CUSTOM_OPENAI]
+API_KEY = ""
+API_URL = ""
+
+[MODELS.OLLAMA]
+API_URL = "" # Ollama API URL - http://host.docker.internal:11434
+
+[SEARCH_ENGINES.GOOGLE]
+API_KEY = ""
+CSE_ID = ""
+
+[SEARCH_ENGINES.SEARXNG]
+ENDPOINT = ""
+
+[SEARCH_ENGINES.BING]
+SUBSCRIPTION_KEY = ""
+
+[SEARCH_ENGINES.BRAVE]
+API_KEY = ""
+
+[SEARCH_ENGINES.YACY]
+ENDPOINT = ""
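The new `[SEARCH_ENGINE_BACKENDS]` table maps each search category to a backend name. A minimal sketch of how per-category accessors over that shape could work, with a `searxng` fallback for unset categories; the names `loadConfig` and `getBackend` here are hypothetical stand-ins, not the repo's actual `src/config.ts` API:

```typescript
// Sketch of per-category backend accessors over the new config layout.
// The object shape mirrors [SEARCH_ENGINE_BACKENDS] in sample.config.toml;
// loadConfig() stands in for the real TOML loader.
type SearchEngineBackends = {
  SEARCH?: string;
  IMAGE?: string;
  VIDEO?: string;
  NEWS?: string;
};

const loadConfig = (): { SEARCH_ENGINE_BACKENDS: SearchEngineBackends } => ({
  SEARCH_ENGINE_BACKENDS: { SEARCH: 'searxng', IMAGE: 'google' },
});

// Fall back to "searxng" when a category is not configured.
const getBackend = (category: keyof SearchEngineBackends): string =>
  loadConfig().SEARCH_ENGINE_BACKENDS[category] ?? 'searxng';

console.log(getBackend('IMAGE')); // configured value: "google"
console.log(getBackend('VIDEO')); // unset, falls back to "searxng"
```

Centralizing the fallback in one accessor keeps the agents free of per-engine defaulting logic.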
@@ -11,49 +11,9 @@ search:
 
 server:
   secret_key: 'a2fb23f1b02e6ee83875b09826990de0f6bd908b6638e8c10277d415f6ab852b' # Is overwritten by ${SEARXNG_SECRET}
-  port: 8080
-  bind_address: "0.0.0.0"
-  base_url: http://localhost:8080/
 
 engines:
   - name: wolframalpha
     disabled: false
+  - name: qwant
+    disabled: true
-
-  - name: google
-    engine: google
-    shortcut: g
-    disabled: false
-
-  - name: bing
-    engine: bing
-    shortcut: b
-    disabled: false
-
-  - name: duckduckgo
-    engine: duckduckgo
-    shortcut: d
-    disabled: false
-
-  - name: yelp
-    engine: yelp
-    shortcut: y
-    disabled: false
-
-ui:
-  static_path: ""
-  templates_path: ""
-  default_theme: simple
-  default_locale: en
-  results_on_new_tab: false
-
-outgoing:
-  request_timeout: 6.0
-  max_request_timeout: 10.0
-  pool_connections: 100
-  pool_maxsize: 10
-  enable_http2: true
-
-server:
-  limiter: false
-  image_proxy: false
-  http_protocol_version: "1.0"
38  src/app.ts
@@ -1,16 +1,38 @@
+import { startWebSocketServer } from './websocket';
 import express from 'express';
 import cors from 'cors';
-import searchRoutes from './routes/search';
-import businessRoutes from './routes/business';
+import http from 'http';
+import routes from './routes';
+import { getPort } from './config';
+import logger from './utils/logger';
+
+const port = getPort();
 
 const app = express();
+const server = http.createServer(app);
 
-// Middleware
-app.use(cors());
+const corsOptions = {
+  origin: '*',
+};
+
+app.use(cors(corsOptions));
 app.use(express.json());
 
-// Routes
-app.use('/api/search', searchRoutes);
-app.use('/api/business', businessRoutes);
+app.use('/api', routes);
+app.get('/api', (_, res) => {
+  res.status(200).json({ status: 'ok' });
+});
 
-export default app;
+server.listen(port, () => {
+  logger.info(`Server is running on port ${port}`);
+});
+
+startWebSocketServer(server);
+
+process.on('uncaughtException', (err, origin) => {
+  logger.error(`Uncaught Exception at ${origin}: ${err}`);
+});
+
+process.on('unhandledRejection', (reason, promise) => {
+  logger.error(`Unhandled Rejection at: ${promise}, reason: ${reason}`);
+});
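The rewritten `src/app.ts` adds process-level safety nets so a crash in an async handler is logged instead of silently killing the process. That pattern in isolation, as a dependency-free sketch (the `logger` object here is a stand-in for the repo's winston-based `src/utils/logger`):

```typescript
// Minimal sketch of the uncaughtException / unhandledRejection guards
// registered in src/app.ts; console stands in for the winston logger.
const logger = {
  info: (msg: string) => console.log(msg),
  error: (msg: string) => console.error(msg),
};

process.on('uncaughtException', (err, origin) => {
  logger.error(`Uncaught Exception at ${origin}: ${err}`);
});

process.on('unhandledRejection', (reason, promise) => {
  logger.error(`Unhandled Rejection at: ${promise}, reason: ${reason}`);
});

logger.info('process-level error handlers registered');
```

Note that an `uncaughtException` handler keeps the process alive in an undefined state; logging and then exiting is the usual recommendation for production servers.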
@@ -7,7 +7,12 @@ import { PromptTemplate } from '@langchain/core/prompts';
 import formatChatHistoryAsString from '../utils/formatHistory';
 import { BaseMessage } from '@langchain/core/messages';
 import { StringOutputParser } from '@langchain/core/output_parsers';
-import { searchSearxng } from '../lib/searxng';
+import { searchSearxng } from '../lib/searchEngines/searxng';
+import { searchGooglePSE } from '../lib/searchEngines/google_pse';
+import { searchBraveAPI } from '../lib/searchEngines/brave';
+import { searchYaCy } from '../lib/searchEngines/yacy';
+import { searchBingAPI } from '../lib/searchEngines/bing';
+import { getImageSearchEngineBackend } from '../config';
 import type { BaseChatModel } from '@langchain/core/language_models/chat_models';
 
 const imageSearchChainPrompt = `
@@ -36,6 +41,103 @@ type ImageSearchChainInput = {
   query: string;
 };
 
+async function performImageSearch(query: string) {
+  const searchEngine = getImageSearchEngineBackend();
+  let images = [];
+
+  switch (searchEngine) {
+    case 'google': {
+      const googleResult = await searchGooglePSE(query);
+      images = googleResult.results
+        .map((result) => {
+          if (result.img_src && result.url && result.title) {
+            return {
+              img_src: result.img_src,
+              url: result.url,
+              title: result.title,
+              source: result.displayLink,
+            };
+          }
+        })
+        .filter(Boolean);
+      break;
+    }
+
+    case 'searxng': {
+      const searxResult = await searchSearxng(query, {
+        engines: ['google images', 'bing images'],
+        pageno: 1,
+      });
+      searxResult.results.forEach((result) => {
+        if (result.img_src && result.url && result.title) {
+          images.push({
+            img_src: result.img_src,
+            url: result.url,
+            title: result.title,
+          });
+        }
+      });
+      break;
+    }
+
+    case 'brave': {
+      const braveResult = await searchBraveAPI(query);
+      images = braveResult.results
+        .map((result) => {
+          if (result.img_src && result.url && result.title) {
+            return {
+              img_src: result.img_src,
+              url: result.url,
+              title: result.title,
+              source: result.url,
+            };
+          }
+        })
+        .filter(Boolean);
+      break;
+    }
+
+    case 'yacy': {
+      const yacyResult = await searchYaCy(query);
+      images = yacyResult.results
+        .map((result) => {
+          if (result.img_src && result.url && result.title) {
+            return {
+              img_src: result.img_src,
+              url: result.url,
+              title: result.title,
+              source: result.url,
+            };
+          }
+        })
+        .filter(Boolean);
+      break;
+    }
+
+    case 'bing': {
+      const bingResult = await searchBingAPI(query);
+      images = bingResult.results
+        .map((result) => {
+          if (result.img_src && result.url && result.title) {
+            return {
+              img_src: result.img_src,
+              url: result.url,
+              title: result.title,
+              source: result.url,
+            };
+          }
+        })
+        .filter(Boolean);
+      break;
+    }
+
+    default:
+      throw new Error(`Unknown search engine ${searchEngine}`);
+  }
+
+  return images;
+}
+
 const strParser = new StringOutputParser();
 
 const createImageSearchChain = (llm: BaseChatModel) => {
@@ -52,22 +154,7 @@ const createImageSearchChain = (llm: BaseChatModel) => {
       llm,
       strParser,
       RunnableLambda.from(async (input: string) => {
-        const res = await searchSearxng(input, {
-          engines: ['bing images', 'google images'],
-        });
-
-        const images = [];
-
-        res.results.forEach((result) => {
-          if (result.img_src && result.url && result.title) {
-            images.push({
-              img_src: result.img_src,
-              url: result.url,
-              title: result.title,
-            });
-          }
-        });
-
+        const images = await performImageSearch(input);
         return images.slice(0, 10);
       }),
     ]);
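Every engine branch in `performImageSearch` uses the same normalization idiom: map raw results to a common shape and drop incomplete entries with `.filter(Boolean)`. That step in isolation, as a standalone sketch with a hypothetical `RawResult` shape (the real engine result types live in the `lib/searchEngines` modules):

```typescript
// Normalize raw engine results into {img_src, url, title}, dropping any
// entry missing a required field - the .map().filter(Boolean) idiom used
// by performImageSearch for the google/brave/yacy/bing branches.
type RawResult = { img_src?: string; url?: string; title?: string };
type Image = { img_src: string; url: string; title: string };

const normalizeImages = (results: RawResult[]): Image[] =>
  results
    .map((r) =>
      r.img_src && r.url && r.title
        ? { img_src: r.img_src, url: r.url, title: r.title }
        : undefined,
    )
    .filter(Boolean) as Image[];

const images = normalizeImages([
  { img_src: 'a.jpg', url: 'https://example.com/a', title: 'A' },
  { url: 'https://example.com/b', title: 'missing image' },
]);
console.log(images.length); // 1 - the incomplete entry is dropped
```

Returning `undefined` from `map` and filtering afterwards keeps the per-result guard in one place instead of pushing into a mutable array.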
@@ -7,26 +7,30 @@ import { PromptTemplate } from '@langchain/core/prompts';
|
|||||||
import formatChatHistoryAsString from '../utils/formatHistory';
|
import formatChatHistoryAsString from '../utils/formatHistory';
|
||||||
import { BaseMessage } from '@langchain/core/messages';
|
import { BaseMessage } from '@langchain/core/messages';
|
||||||
import { StringOutputParser } from '@langchain/core/output_parsers';
|
import { StringOutputParser } from '@langchain/core/output_parsers';
|
||||||
import { searchSearxng } from '../lib/searxng';
|
import { searchSearxng } from '../lib/searchEngines/searxng';
|
||||||
|
import { searchGooglePSE } from '../lib/searchEngines/google_pse';
|
||||||
|
import { searchBraveAPI } from '../lib/searchEngines/brave';
|
||||||
|
import { searchBingAPI } from '../lib/searchEngines/bing';
|
||||||
|
import { getVideoSearchEngineBackend } from '../config';
|
||||||
import type { BaseChatModel } from '@langchain/core/language_models/chat_models';
|
import type { BaseChatModel } from '@langchain/core/language_models/chat_models';
|
||||||
|
|
||||||
const VideoSearchChainPrompt = `
|
const VideoSearchChainPrompt = `
|
||||||
You will be given a conversation below and a follow up question. You need to rephrase the follow-up question so it is a standalone question that can be used by the LLM to search Youtube for videos.
|
You will be given a conversation below and a follow up question. You need to rephrase the follow-up question so it is a standalone question that can be used by the LLM to search Youtube for videos.
|
||||||
You need to make sure the rephrased question agrees with the conversation and is relevant to the conversation.
|
You need to make sure the rephrased question agrees with the conversation and is relevant to the conversation.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
1. Follow up question: How does a car work?
|
1. Follow up question: How does a car work?
|
||||||
Rephrased: How does a car work?
|
Rephrased: How does a car work?
|
||||||
|
|
||||||
2. Follow up question: What is the theory of relativity?
|
2. Follow up question: What is the theory of relativity?
|
||||||
Rephrased: What is theory of relativity
|
Rephrased: What is theory of relativity
|
||||||
|
|
||||||
3. Follow up question: How does an AC work?
|
3. Follow up question: How does an AC work?
|
||||||
Rephrased: How does an AC work
|
Rephrased: How does an AC work
|
||||||
|
|
||||||
Conversation:
|
Conversation:
|
||||||
{chat_history}
|
{chat_history}
|
||||||
|
|
||||||
Follow up question: {query}
|
Follow up question: {query}
|
||||||
Rephrased question:
|
Rephrased question:
|
||||||
`;
|
`;
|
||||||
@@ -38,6 +42,102 @@ type VideoSearchChainInput = {
|
|||||||
|
|
||||||
const strParser = new StringOutputParser();
|
const strParser = new StringOutputParser();
|
||||||
|
|
||||||
|
async function performVideoSearch(query: string) {
|
||||||
|
const searchEngine = getVideoSearchEngineBackend();
|
||||||
|
const youtubeQuery = `${query} site:youtube.com`;
|
||||||
|
let videos = [];
|
||||||
|
|
||||||
|
switch (searchEngine) {
|
||||||
|
case 'google': {
|
||||||
|
const googleResult = await searchGooglePSE(youtubeQuery);
|
||||||
|
googleResult.results.forEach((result) => {
|
||||||
|
// Use .results instead of .originalres
|
||||||
|
if (result.img_src && result.url && result.title) {
|
||||||
|
const videoId = new URL(result.url).searchParams.get('v');
|
||||||
|
videos.push({
|
||||||
|
img_src: result.img_src,
|
||||||
|
url: result.url,
|
||||||
|
title: result.title,
|
||||||
|
iframe_src: videoId
|
||||||
|
? `https://www.youtube.com/embed/${videoId}`
|
||||||
|
: null,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
});
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
case 'searxng': {
|
||||||
|
const searxResult = await searchSearxng(query, {
|
||||||
|
engines: ['youtube'],
|
||||||
|
});
|
||||||
|
searxResult.results.forEach((result) => {
|
||||||
|
if (
|
||||||
|
result.thumbnail &&
|
||||||
|
result.url &&
|
||||||
|
result.title &&
|
||||||
|
result.iframe_src
|
||||||
|
) {
|
||||||
|
videos.push({
|
||||||
|
img_src: result.thumbnail,
|
+          url: result.url,
+          title: result.title,
+          iframe_src: result.iframe_src,
+        });
+      }
+    });
+    break;
+  }
+
+  case 'brave': {
+    const braveResult = await searchBraveAPI(youtubeQuery);
+    braveResult.results.forEach((result) => {
+      if (result.img_src && result.url && result.title) {
+        const videoId = new URL(result.url).searchParams.get('v');
+        videos.push({
+          img_src: result.img_src,
+          url: result.url,
+          title: result.title,
+          iframe_src: videoId
+            ? `https://www.youtube.com/embed/${videoId}`
+            : null,
+        });
+      }
+    });
+    break;
+  }
+
+  case 'yacy': {
+    console.log('Not available for yacy');
+    videos = [];
+    break;
+  }
+
+  case 'bing': {
+    const bingResult = await searchBingAPI(youtubeQuery);
+    bingResult.results.forEach((result) => {
+      if (result.img_src && result.url && result.title) {
+        const videoId = new URL(result.url).searchParams.get('v');
+        videos.push({
+          img_src: result.img_src,
+          url: result.url,
+          title: result.title,
+          iframe_src: videoId
+            ? `https://www.youtube.com/embed/${videoId}`
+            : null,
+        });
+      }
+    });
+    break;
+  }
+
+  default:
+    throw new Error(`Unknown search engine ${searchEngine}`);
+  }
+
+  return videos;
+}
 
 const createVideoSearchChain = (llm: BaseChatModel) => {
   return RunnableSequence.from([
     RunnableMap.from({
@@ -52,28 +152,7 @@ const createVideoSearchChain = (llm: BaseChatModel) => {
     llm,
     strParser,
     RunnableLambda.from(async (input: string) => {
-      const res = await searchSearxng(input, {
-        engines: ['youtube'],
-      });
-
-      const videos = [];
-
-      res.results.forEach((result) => {
-        if (
-          result.thumbnail &&
-          result.url &&
-          result.title &&
-          result.iframe_src
-        ) {
-          videos.push({
-            img_src: result.thumbnail,
-            url: result.url,
-            title: result.title,
-            iframe_src: result.iframe_src,
-          });
-        }
-      });
-
+      const videos = await performVideoSearch(input);
       return videos.slice(0, 10);
     }),
   ]);
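The Brave and Bing cases above derive `iframe_src` the same way: pull the `v` query parameter off a YouTube watch URL and build an embed URL, falling back to `null` when it is absent. A minimal standalone sketch of that derivation (the helper name `toEmbedUrl` is illustrative, not from the diff):

```typescript
// Derive a YouTube embed URL from a watch URL, as in the Brave/Bing cases.
// Returns null when the URL carries no `v` query parameter.
const toEmbedUrl = (watchUrl: string): string | null => {
  const videoId = new URL(watchUrl).searchParams.get('v');
  return videoId ? `https://www.youtube.com/embed/${videoId}` : null;
};

console.log(toEmbedUrl('https://www.youtube.com/watch?v=dQw4w9WgXcQ'));
// https://www.youtube.com/embed/dQw4w9WgXcQ
console.log(toEmbedUrl('https://www.youtube.com/playlist?list=abc'));
// null
```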
src/config.ts
@@ -10,15 +10,51 @@ interface Config {
     SIMILARITY_MEASURE: string;
     KEEP_ALIVE: string;
   };
-  API_KEYS: {
-    OPENAI: string;
-    GROQ: string;
-    ANTHROPIC: string;
-    GEMINI: string;
-  };
-  API_ENDPOINTS: {
-    SEARXNG: string;
-    OLLAMA: string;
-  };
+  SEARCH_ENGINE_BACKENDS: {
+    SEARCH: string;
+    IMAGE: string;
+    VIDEO: string;
+    NEWS: string;
+  };
+  MODELS: {
+    OPENAI: {
+      API_KEY: string;
+    };
+    GROQ: {
+      API_KEY: string;
+    };
+    ANTHROPIC: {
+      API_KEY: string;
+    };
+    GEMINI: {
+      API_KEY: string;
+    };
+    OLLAMA: {
+      API_URL: string;
+    };
+    CUSTOM_OPENAI: {
+      API_URL: string;
+      API_KEY: string;
+      MODEL_NAME: string;
+    };
+  };
+  SEARCH_ENGINES: {
+    GOOGLE: {
+      API_KEY: string;
+      CSE_ID: string;
+    };
+    SEARXNG: {
+      ENDPOINT: string;
+    };
+    BING: {
+      SUBSCRIPTION_KEY: string;
+    };
+    BRAVE: {
+      API_KEY: string;
+    };
+    YACY: {
+      ENDPOINT: string;
+    };
+  };
 }
@@ -38,55 +74,89 @@ export const getSimilarityMeasure = () =>
 
 export const getKeepAlive = () => loadConfig().GENERAL.KEEP_ALIVE;
 
-export const getOpenaiApiKey = () => loadConfig().API_KEYS.OPENAI;
+export const getOpenaiApiKey = () => loadConfig().MODELS.OPENAI.API_KEY;
 
-export const getGroqApiKey = () => loadConfig().API_KEYS.GROQ;
+export const getGroqApiKey = () => loadConfig().MODELS.GROQ.API_KEY;
 
-export const getAnthropicApiKey = () => loadConfig().API_KEYS.ANTHROPIC;
+export const getAnthropicApiKey = () => loadConfig().MODELS.ANTHROPIC.API_KEY;
 
-export const getGeminiApiKey = () => loadConfig().API_KEYS.GEMINI;
+export const getGeminiApiKey = () => loadConfig().MODELS.GEMINI.API_KEY;
 
+export const getSearchEngineBackend = () =>
+  loadConfig().SEARCH_ENGINE_BACKENDS.SEARCH;
+
+export const getImageSearchEngineBackend = () =>
+  loadConfig().SEARCH_ENGINE_BACKENDS.IMAGE || getSearchEngineBackend();
+
+export const getVideoSearchEngineBackend = () =>
+  loadConfig().SEARCH_ENGINE_BACKENDS.VIDEO || getSearchEngineBackend();
+
+export const getNewsSearchEngineBackend = () =>
+  loadConfig().SEARCH_ENGINE_BACKENDS.NEWS || getSearchEngineBackend();
+
+export const getGoogleApiKey = () => loadConfig().SEARCH_ENGINES.GOOGLE.API_KEY;
+
+export const getGoogleCseId = () => loadConfig().SEARCH_ENGINES.GOOGLE.CSE_ID;
+
+export const getBraveApiKey = () => loadConfig().SEARCH_ENGINES.BRAVE.API_KEY;
+
+export const getBingSubscriptionKey = () =>
+  loadConfig().SEARCH_ENGINES.BING.SUBSCRIPTION_KEY;
+
+export const getYacyJsonEndpoint = () =>
+  loadConfig().SEARCH_ENGINES.YACY.ENDPOINT;
+
 export const getSearxngApiEndpoint = () =>
-  process.env.SEARXNG_API_URL || loadConfig().API_ENDPOINTS.SEARXNG;
+  process.env.SEARXNG_API_URL || loadConfig().SEARCH_ENGINES.SEARXNG.ENDPOINT;
 
-export const getOllamaApiEndpoint = () => loadConfig().API_ENDPOINTS.OLLAMA;
+export const getOllamaApiEndpoint = () => loadConfig().MODELS.OLLAMA.API_URL;
 
-export const updateConfig = (config: RecursivePartial<Config>) => {
-  const currentConfig = loadConfig();
-
-  for (const key in currentConfig) {
-    if (!config[key]) config[key] = {};
-
-    if (typeof currentConfig[key] === 'object' && currentConfig[key] !== null) {
-      for (const nestedKey in currentConfig[key]) {
-        if (
-          !config[key][nestedKey] &&
-          currentConfig[key][nestedKey] &&
-          config[key][nestedKey] !== ''
-        ) {
-          config[key][nestedKey] = currentConfig[key][nestedKey];
-        }
-      }
-    } else if (currentConfig[key] && config[key] !== '') {
-      config[key] = currentConfig[key];
-    }
-  }
-
-  fs.writeFileSync(
-    path.join(__dirname, `../${configFileName}`),
-    toml.stringify(config),
-  );
-};
-
-export const config = {
-  ollama: {
-    url: process.env.OLLAMA_URL || 'http://localhost:11434',
-    model: process.env.OLLAMA_MODEL || 'mistral',
-    options: {
-      temperature: 0.1,
-      top_p: 0.9,
-      timeout: 30000 // 30 seconds timeout
-    }
-  },
-  // ... other config
-};
+export const getCustomOpenaiApiKey = () =>
+  loadConfig().MODELS.CUSTOM_OPENAI.API_KEY;
+
+export const getCustomOpenaiApiUrl = () =>
+  loadConfig().MODELS.CUSTOM_OPENAI.API_URL;
+
+export const getCustomOpenaiModelName = () =>
+  loadConfig().MODELS.CUSTOM_OPENAI.MODEL_NAME;
+
+const mergeConfigs = (current: any, update: any): any => {
+  if (update === null || update === undefined) {
+    return current;
+  }
+
+  if (typeof current !== 'object' || current === null) {
+    return update;
+  }
+
+  const result = { ...current };
+
+  for (const key in update) {
+    if (Object.prototype.hasOwnProperty.call(update, key)) {
+      const updateValue = update[key];
+
+      if (
+        typeof updateValue === 'object' &&
+        updateValue !== null &&
+        typeof result[key] === 'object' &&
+        result[key] !== null
+      ) {
+        result[key] = mergeConfigs(result[key], updateValue);
+      } else if (updateValue !== undefined) {
+        result[key] = updateValue;
+      }
+    }
+  }
+
+  return result;
+};
+
+export const updateConfig = (config: RecursivePartial<Config>) => {
+  const currentConfig = loadConfig();
+  const mergedConfig = mergeConfigs(currentConfig, config);
+
+  fs.writeFileSync(
+    path.join(__dirname, `../${configFileName}`),
+    toml.stringify(mergedConfig),
+  );
+};
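The new `mergeConfigs` helper in src/config.ts replaces the old field-by-field backfill loop with a recursive deep merge: keys present in the update win, nested objects merge recursively, and keys absent from the update survive from the current config. A standalone sketch showing that behavior (the merge logic is copied from the diff; the sample config values are illustrative):

```typescript
// Recursive deep merge as introduced in src/config.ts: `update` overrides
// `current`, nested objects merge, and unspecified keys are preserved.
const mergeConfigs = (current: any, update: any): any => {
  if (update === null || update === undefined) return current;
  if (typeof current !== 'object' || current === null) return update;

  const result = { ...current };
  for (const key in update) {
    if (Object.prototype.hasOwnProperty.call(update, key)) {
      const updateValue = update[key];
      if (
        typeof updateValue === 'object' &&
        updateValue !== null &&
        typeof result[key] === 'object' &&
        result[key] !== null
      ) {
        result[key] = mergeConfigs(result[key], updateValue);
      } else if (updateValue !== undefined) {
        result[key] = updateValue;
      }
    }
  }
  return result;
};

// Updating one nested key leaves sibling sections untouched.
const current = {
  MODELS: {
    OPENAI: { API_KEY: 'old' },
    OLLAMA: { API_URL: 'http://localhost:11434' },
  },
};
const merged = mergeConfigs(current, { MODELS: { OPENAI: { API_KEY: 'new' } } });

console.log(merged.MODELS.OPENAI.API_KEY); // new
console.log(merged.MODELS.OLLAMA.API_URL); // http://localhost:11434
```

This is why the new `updateConfig` can write `toml.stringify(mergedConfig)` directly: partial updates no longer risk blanking out config sections the caller did not send.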
@@ -1,40 +0,0 @@
-import dotenv from 'dotenv';
-
-// Load environment variables
-dotenv.config();
-
-// Environment configuration
-const env = {
-  // Supabase Configuration
-  SUPABASE_URL: process.env.SUPABASE_URL || '',
-  SUPABASE_KEY: process.env.SUPABASE_KEY || '',
-
-  // Server Configuration
-  PORT: parseInt(process.env.PORT || '3001', 10),
-  NODE_ENV: process.env.NODE_ENV || 'development',
-
-  // Search Configuration
-  MAX_RESULTS_PER_QUERY: parseInt(process.env.MAX_RESULTS_PER_QUERY || '50', 10),
-  CACHE_DURATION_HOURS: parseInt(process.env.CACHE_DURATION_HOURS || '24', 10),
-  CACHE_DURATION_DAYS: parseInt(process.env.CACHE_DURATION_DAYS || '7', 10),
-
-  // SearxNG Configuration
-  SEARXNG_URL: process.env.SEARXNG_URL || 'http://localhost:4000',
-
-  // Ollama Configuration
-  OLLAMA_URL: process.env.OLLAMA_URL || 'http://localhost:11434',
-  OLLAMA_MODEL: process.env.OLLAMA_MODEL || 'deepseek-coder:6.7b',
-
-  // Hugging Face Configuration
-  HUGGING_FACE_API_KEY: process.env.HUGGING_FACE_API_KEY || ''
-};
-
-// Validate required environment variables
-const requiredEnvVars = ['SUPABASE_URL', 'SUPABASE_KEY', 'SEARXNG_URL'];
-for (const envVar of requiredEnvVars) {
-  if (!env[envVar as keyof typeof env]) {
-    throw new Error(`Missing required environment variable: ${envVar}`);
-  }
-}
-
-export { env };
@@ -1,77 +0,0 @@
-import dotenv from 'dotenv';
-import path from 'path';
-
-// Load .env file
-dotenv.config({ path: path.resolve(__dirname, '../../.env') });
-
-export interface Config {
-  supabase: {
-    url: string;
-    anonKey: string;
-  };
-  server: {
-    port: number;
-    nodeEnv: string;
-  };
-  search: {
-    maxResultsPerQuery: number;
-    cacheDurationHours: number;
-    searxngUrl?: string;
-  };
-  rateLimit: {
-    windowMs: number;
-    maxRequests: number;
-  };
-  security: {
-    corsOrigin: string;
-    jwtSecret: string;
-  };
-  proxy?: {
-    http?: string;
-    https?: string;
-  };
-  logging: {
-    level: string;
-  };
-}
-
-const config: Config = {
-  supabase: {
-    url: process.env.SUPABASE_URL || '',
-    anonKey: process.env.SUPABASE_ANON_KEY || '',
-  },
-  server: {
-    port: parseInt(process.env.PORT || '3000', 10),
-    nodeEnv: process.env.NODE_ENV || 'development',
-  },
-  search: {
-    maxResultsPerQuery: parseInt(process.env.MAX_RESULTS_PER_QUERY || '20', 10),
-    cacheDurationHours: parseInt(process.env.CACHE_DURATION_HOURS || '24', 10),
-    searxngUrl: process.env.SEARXNG_URL
-  },
-  rateLimit: {
-    windowMs: parseInt(process.env.RATE_LIMIT_WINDOW_MS || '900000', 10),
-    maxRequests: parseInt(process.env.RATE_LIMIT_MAX_REQUESTS || '100', 10),
-  },
-  security: {
-    corsOrigin: process.env.CORS_ORIGIN || 'http://localhost:3000',
-    jwtSecret: process.env.JWT_SECRET || 'your_jwt_secret_key',
-  },
-  logging: {
-    level: process.env.LOG_LEVEL || 'info',
-  },
-};
-
-// Validate required configuration
-const validateConfig = () => {
-  if (!config.supabase.url) {
-    throw new Error('SUPABASE_URL is required');
-  }
-  if (!config.supabase.anonKey) {
-    throw new Error('SUPABASE_ANON_KEY is required');
-  }
-};
-
-validateConfig();
-
-export { config };
src/index.ts
@@ -1,24 +0,0 @@
-import './config/env'; // Load environment variables first
-import { startServer } from './server';
-import { isPortAvailable } from './utils/portCheck';
-import { testConnection } from './lib/supabase';
-
-const PORT = process.env.PORT || 3001;
-
-const init = async () => {
-  if (!await isPortAvailable(PORT)) {
-    console.error(`Port ${PORT} is in use. Please try a different port or free up the current one.`);
-    process.exit(1);
-  }
-
-  // Test Supabase connection
-  const isConnected = await testConnection();
-  if (!isConnected) {
-    console.error('Failed to connect to Supabase. Please check your configuration.');
-    process.exit(1);
-  }
-
-  startServer();
-};
-
-init().catch(console.error);
@@ -1,116 +0,0 @@
-export interface Category {
-  id: string;
-  name: string;
-  icon: string;
-  subcategories: SubCategory[];
-}
-
-export interface SubCategory {
-  id: string;
-  name: string;
-}
-
-export const categories: Category[] = [
-  {
-    id: 'real-estate-pros',
-    name: 'Real Estate Professionals',
-    icon: '🏢',
-    subcategories: [
-      { id: 'wholesalers', name: 'Real Estate Wholesalers' },
-      { id: 'agents', name: 'Real Estate Agents' },
-      { id: 'attorneys', name: 'Real Estate Attorneys' },
-      { id: 'scouts', name: 'Property Scouts' },
-      { id: 'brokers', name: 'Real Estate Brokers' },
-      { id: 'consultants', name: 'Real Estate Consultants' }
-    ]
-  },
-  {
-    id: 'legal-title',
-    name: 'Legal & Title Services',
-    icon: '⚖️',
-    subcategories: [
-      { id: 'title-companies', name: 'Title Companies' },
-      { id: 'closing-attorneys', name: 'Closing Attorneys' },
-      { id: 'zoning-consultants', name: 'Zoning Consultants' },
-      { id: 'probate-specialists', name: 'Probate Specialists' },
-      { id: 'eviction-specialists', name: 'Eviction Specialists' }
-    ]
-  },
-  {
-    id: 'financial',
-    name: 'Financial Services',
-    icon: '💰',
-    subcategories: [
-      { id: 'hard-money', name: 'Hard Money Lenders' },
-      { id: 'private-equity', name: 'Private Equity Investors' },
-      { id: 'mortgage-brokers', name: 'Mortgage Brokers' },
-      { id: 'tax-advisors', name: 'Tax Advisors' },
-      { id: 'appraisers', name: 'Appraisers' }
-    ]
-  },
-  {
-    id: 'contractors',
-    name: 'Specialist Contractors',
-    icon: '🔨',
-    subcategories: [
-      { id: 'general', name: 'General Contractors' },
-      { id: 'plumbers', name: 'Plumbers' },
-      { id: 'electricians', name: 'Electricians' },
-      { id: 'hvac', name: 'HVAC Technicians' },
-      { id: 'roofers', name: 'Roofers' },
-      { id: 'foundation', name: 'Foundation Specialists' },
-      { id: 'asbestos', name: 'Asbestos Removal' },
-      { id: 'mold', name: 'Mold Remediation' }
-    ]
-  },
-  {
-    id: 'property-services',
-    name: 'Property Services',
-    icon: '🏠',
-    subcategories: [
-      { id: 'surveyors', name: 'Surveyors' },
-      { id: 'inspectors', name: 'Inspectors' },
-      { id: 'property-managers', name: 'Property Managers' },
-      { id: 'environmental', name: 'Environmental Consultants' },
-      { id: 'junk-removal', name: 'Junk Removal Services' },
-      { id: 'cleaning', name: 'Property Cleaning' }
-    ]
-  },
-  {
-    id: 'marketing',
-    name: 'Marketing & Lead Gen',
-    icon: '📢',
-    subcategories: [
-      { id: 'direct-mail', name: 'Direct Mail Services' },
-      { id: 'social-media', name: 'Social Media Marketing' },
-      { id: 'seo', name: 'SEO Specialists' },
-      { id: 'ppc', name: 'PPC Advertising' },
-      { id: 'lead-gen', name: 'Lead Generation' },
-      { id: 'skip-tracing', name: 'Skip Tracing Services' }
-    ]
-  },
-  {
-    id: 'data-tech',
-    name: 'Data & Technology',
-    icon: '💻',
-    subcategories: [
-      { id: 'data-providers', name: 'Property Data Providers' },
-      { id: 'crm', name: 'CRM Systems' },
-      { id: 'valuation', name: 'Valuation Tools' },
-      { id: 'virtual-tours', name: 'Virtual Tour Services' },
-      { id: 'automation', name: 'Automation Tools' }
-    ]
-  },
-  {
-    id: 'specialty',
-    name: 'Specialty Services',
-    icon: '🎯',
-    subcategories: [
-      { id: 'auction', name: 'Auction Companies' },
-      { id: 'relocation', name: 'Relocation Services' },
-      { id: 'staging', name: 'Home Staging' },
-      { id: 'photography', name: 'Real Estate Photography' },
-      { id: 'virtual-assistant', name: 'Virtual Assistants' }
-    ]
-  }
-];
@@ -1,51 +0,0 @@
-import { Database } from 'better-sqlite3';
-import path from 'path';
-
-interface OptOutEntry {
-  domain: string;
-  email: string;
-  reason?: string;
-  timestamp: Date;
-}
-
-export class OptOutDatabase {
-  private db: Database;
-
-  constructor() {
-    this.db = new Database(path.join(__dirname, '../../../data/optout.db'));
-    this.initializeDatabase();
-  }
-
-  private initializeDatabase() {
-    this.db.exec(`
-      CREATE TABLE IF NOT EXISTS opt_outs (
-        domain TEXT PRIMARY KEY,
-        email TEXT NOT NULL,
-        reason TEXT,
-        timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
-      );
-      CREATE INDEX IF NOT EXISTS idx_domain ON opt_outs(domain);
-    `);
-  }
-
-  async addOptOut(entry: OptOutEntry): Promise<void> {
-    const stmt = this.db.prepare(
-      'INSERT OR REPLACE INTO opt_outs (domain, email, reason, timestamp) VALUES (?, ?, ?, ?)'
-    );
-    stmt.run(entry.domain, entry.email, entry.reason, entry.timestamp.toISOString());
-  }
-
-  isOptedOut(domain: string): boolean {
-    const stmt = this.db.prepare('SELECT 1 FROM opt_outs WHERE domain = ?');
-    return stmt.get(domain) !== undefined;
-  }
-
-  removeOptOut(domain: string): void {
-    const stmt = this.db.prepare('DELETE FROM opt_outs WHERE domain = ?');
-    stmt.run(domain);
-  }
-
-  getOptOutList(): OptOutEntry[] {
-    return this.db.prepare('SELECT * FROM opt_outs').all();
-  }
-}
@@ -1,74 +0,0 @@
-import { createClient } from '@supabase/supabase-js';
-import { BusinessData } from '../searxng';
-import { env } from '../../config/env';
-
-// Create the Supabase client with validated environment variables
-export const supabase = createClient(
-  env.supabase.url,
-  env.supabase.anonKey,
-  {
-    auth: {
-      persistSession: false // Since this is a server environment
-    }
-  }
-);
-
-// Define the cache record type
-export interface CacheRecord {
-  id: string;
-  query: string;
-  results: BusinessData[];
-  location: string;
-  category: string;
-  created_at: string;
-  updated_at: string;
-  expires_at: string;
-}
-
-// Export database helper functions
-export async function getCacheEntry(
-  category: string,
-  location: string
-): Promise<CacheRecord | null> {
-  const { data, error } = await supabase
-    .from('search_cache')
-    .select('*')
-    .eq('category', category.toLowerCase())
-    .eq('location', location.toLowerCase())
-    .gt('expires_at', new Date().toISOString())
-    .order('created_at', { ascending: false })
-    .limit(1)
-    .single();
-
-  if (error) {
-    console.error('Cache lookup failed:', error);
-    return null;
-  }
-
-  return data;
-}
-
-export async function saveCacheEntry(
-  category: string,
-  location: string,
-  results: BusinessData[],
-  expiresInDays: number = 7
-): Promise<void> {
-  const expiresAt = new Date();
-  expiresAt.setDate(expiresAt.getDate() + expiresInDays);
-
-  const { error } = await supabase
-    .from('search_cache')
-    .insert({
-      query: `${category} in ${location}`,
-      category: category.toLowerCase(),
-      location: location.toLowerCase(),
-      results,
-      expires_at: expiresAt.toISOString()
-    });
-
-  if (error) {
-    console.error('Failed to save cache entry:', error);
-    throw error;
-  }
-}
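The removed `saveCacheEntry` helper computed its expiry as "now plus N days" via `Date.setDate`, and `getCacheEntry` then filtered on `expires_at > now`. A standalone sketch of that expiry arithmetic (the helper name `computeExpiry` is illustrative, not from the file):

```typescript
// Cache expiry as computed in the removed saveCacheEntry: now + expiresInDays.
const computeExpiry = (expiresInDays: number = 7, now: Date = new Date()): Date => {
  const expiresAt = new Date(now);
  expiresAt.setDate(expiresAt.getDate() + expiresInDays);
  return expiresAt;
};

const now = new Date('2025-01-01T00:00:00Z');
const expiry = computeExpiry(7, now);
const days = (expiry.getTime() - now.getTime()) / 86400000;
console.log(days); // 7
```

Note that `setDate` rolls over month boundaries automatically, which is why the original code did not need any month arithmetic.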
@@ -1,195 +0,0 @@
-import axios from 'axios';
-import * as cheerio from 'cheerio';
-import { Cache } from './utils/cache';
-import { RateLimiter } from './utils/rateLimiter';
-import robotsParser from 'robots-parser';
-
-interface ScrapingResult {
-  emails: string[];
-  phones: string[];
-  addresses: string[];
-  socialLinks: string[];
-  source: string;
-  timestamp: Date;
-  attribution: string;
-}
-
-export class EmailScraper {
-  private cache: Cache<ScrapingResult>;
-  private rateLimiter: RateLimiter;
-  private robotsCache = new Map<string, any>();
-
-  constructor(private options = {
-    timeout: 5000,
-    cacheTTL: 60,
-    rateLimit: { windowMs: 60000, maxRequests: 10 }, // More conservative rate limiting
-    userAgent: 'BizSearch/1.0 (+https://your-domain.com/about) - Business Directory Service'
-  }) {
-    this.cache = new Cache<ScrapingResult>(options.cacheTTL);
-    this.rateLimiter = new RateLimiter(options.rateLimit.windowMs, options.rateLimit.maxRequests);
-  }
-
-  private async checkRobotsPermission(url: string): Promise<boolean> {
-    try {
-      const { protocol, host } = new URL(url);
-      const robotsUrl = `${protocol}//${host}/robots.txt`;
-
-      let parser = this.robotsCache.get(host);
-      if (!parser) {
-        const response = await axios.get(robotsUrl);
-        parser = robotsParser(robotsUrl, response.data);
-        this.robotsCache.set(host, parser);
-      }
-
-      return parser.isAllowed(url, this.options.userAgent);
-    } catch (error) {
-      console.warn(`Could not check robots.txt for ${url}:`, error);
-      return true; // Assume allowed if robots.txt is unavailable
-    }
-  }
-
-  async scrapeEmails(url: string): Promise<ScrapingResult> {
-    // Check cache first
-    const cached = this.cache.get(url);
-    if (cached) return cached;
-
-    // Check robots.txt
-    const allowed = await this.checkRobotsPermission(url);
-    if (!allowed) {
-      console.log(`Respecting robots.txt disallow for ${url}`);
-      return {
-        emails: [],
-        phones: [],
-        addresses: [],
-        socialLinks: [],
-        source: url,
-        timestamp: new Date(),
-        attribution: 'Restricted by robots.txt'
-      };
-    }
-
-    // Wait for rate limiting slot
-    await this.rateLimiter.waitForSlot();
-
-    try {
-      const response = await axios.get(url, {
-        timeout: this.options.timeout,
-        headers: {
-          'User-Agent': this.options.userAgent,
-          'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
-        }
-      });
-
-      // Check for noindex meta tag
-      const $ = cheerio.load(response.data);
-      if ($('meta[name="robots"][content*="noindex"]').length > 0) {
-        return {
-          emails: [],
-          phones: [],
-          addresses: [],
-          socialLinks: [],
-          source: url,
-          timestamp: new Date(),
-          attribution: 'Respecting noindex directive'
-        };
-      }
-
-      // Only extract contact information from public contact pages or structured data
-      const isContactPage = /contact|about/i.test(url) ||
-        $('h1, h2').text().toLowerCase().includes('contact');
-
-      const result = {
-        emails: new Set<string>(),
-        phones: new Set<string>(),
-        addresses: new Set<string>(),
-        socialLinks: new Set<string>(),
-        source: url,
-        timestamp: new Date(),
-        attribution: `Data from public business listing at ${new URL(url).hostname}`
-      };
-
-      // Extract from structured data (Schema.org)
-      $('script[type="application/ld+json"]').each((_, element) => {
-        try {
-          const data = JSON.parse($(element).html() || '{}');
-          if (data['@type'] === 'LocalBusiness' || data['@type'] === 'Organization') {
-            if (data.email) result.emails.add(data.email.toLowerCase());
-            if (data.telephone) result.phones.add(this.formatPhoneNumber(data.telephone));
-            if (data.address) {
-              const fullAddress = this.formatAddress(data.address);
-              if (fullAddress) result.addresses.add(fullAddress);
-            }
-          }
-        } catch (e) {
-          console.error('Error parsing JSON-LD:', e);
-        }
-      });
-
-      // Only scrape additional info if it's a contact page
-      if (isContactPage) {
-        // Extract clearly marked contact information
-        $('[itemprop="email"], .contact-email, .email').each((_, element) => {
-          const email = $(element).text().trim();
-          if (this.isValidEmail(email)) {
-            result.emails.add(email.toLowerCase());
-          }
-        });
-
-        $('[itemprop="telephone"], .phone, .contact-phone').each((_, element) => {
-          const phone = $(element).text().trim();
-          const formatted = this.formatPhoneNumber(phone);
-          if (formatted) result.phones.add(formatted);
-        });
-      }
-
-      const finalResult = {
-        ...result,
-        emails: Array.from(result.emails),
-        phones: Array.from(result.phones),
-        addresses: Array.from(result.addresses),
-        socialLinks: Array.from(result.socialLinks)
-      };
-
-      this.cache.set(url, finalResult);
-      return finalResult;
-
-    } catch (error) {
-      console.error(`Failed to scrape ${url}:`, error);
-      return {
-        emails: [],
-        phones: [],
-        addresses: [],
-        socialLinks: [],
-        source: url,
-        timestamp: new Date(),
-        attribution: 'Error accessing page'
-      };
-    }
-  }
-
-  private isValidEmail(email: string): boolean {
-    return /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(email);
-  }
-
-  private formatPhoneNumber(phone: string): string {
-    const digits = phone.replace(/\D/g, '');
-    if (digits.length === 10) {
-      return `(${digits.slice(0,3)}) ${digits.slice(3,6)}-${digits.slice(6)}`;
-    }
-    return phone;
-  }
-
-  private formatAddress(address: any): string | null {
-    if (typeof address === 'string') return address;
-    if (typeof address === 'object') {
-      const parts = [
-        address.streetAddress,
-        address.addressLocality,
-        address.addressRegion,
-        address.postalCode
-      ].filter(Boolean);
-      if (parts.length > 0) return parts.join(', ');
-    }
-    return null;
-  }
-}
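The removed `EmailScraper.formatPhoneNumber` normalizes exactly-ten-digit numbers to `(XXX) XXX-XXXX` and passes anything else through unchanged. A standalone, dependency-free sketch of that helper (logic copied from the deleted class):

```typescript
// Phone normalization from the removed EmailScraper: strip non-digits,
// format 10-digit numbers as (XXX) XXX-XXXX, otherwise return the input.
const formatPhoneNumber = (phone: string): string => {
  const digits = phone.replace(/\D/g, '');
  if (digits.length === 10) {
    return `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
  }
  return phone;
};

console.log(formatPhoneNumber('415-555-0123'));   // (415) 555-0123
console.log(formatPhoneNumber('+1 415 555 0123')); // unchanged: 11 digits
```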
@@ -1,19 +0,0 @@
-import { Business, SearchParams } from '../../../types/business';
-import { WebScraperProvider } from './webScraper';
-
-export class BusinessProvider {
-  private scraper: WebScraperProvider;
-
-  constructor() {
-    this.scraper = new WebScraperProvider();
-  }
-
-  async search(params: SearchParams): Promise<Business[]> {
-    return this.scraper.search(params);
-  }
-
-  async getDetails(businessId: string): Promise<Business | null> {
-    // Implement detailed business lookup using stored data or additional scraping
-    return null;
-  }
-}
@@ -1,111 +0,0 @@
|
|||||||
import { Business, SearchParams } from '../../../types/business';
|
|
||||||
import { searchWeb } from '../search'; // This is Perplexica's existing search function
|
|
||||||
import { parseHTML } from '../utils/parser';
|
|
||||||
|
|
||||||
export class WebScraperProvider {
|
|
||||||
async search(params: SearchParams): Promise<Business[]> {
|
|
||||||
const searchQueries = this.generateQueries(params);
|
|
||||||
const businesses: Business[] = [];
|
|
||||||
|
|
||||||
for (const query of searchQueries) {
|
|
||||||
// Use Perplexica's existing search functionality
|
|
||||||
const results = await searchWeb(query, {
|
|
||||||
maxResults: 20,
|
|
||||||
type: 'general' // or 'news' depending on what we want
|
|
||||||
});
|
|
||||||
|
|
||||||
for (const result of results) {
|
|
||||||
try {
          const html = await fetch(result.url).then(res => res.text());
          const businessData = await this.extractBusinessData(html, result.url);
          if (businessData) {
            businesses.push(businessData);
          }
        } catch (error) {
          console.error(`Failed to extract data from ${result.url}:`, error);
        }
      }
    }

    return this.deduplicateBusinesses(businesses);
  }

  private generateQueries(params: SearchParams): string[] {
    const { location, category } = params;
    return [
      `${category} in ${location}`,
      `${category} business ${location}`,
      `best ${category} near ${location}`,
      `${category} services ${location} reviews`
    ];
  }

  private async extractBusinessData(html: string, sourceUrl: string): Promise<Business | null> {
    const $ = parseHTML(html);

    // Different extraction logic based on source
    if (sourceUrl.includes('yelp.com')) {
      return this.extractYelpData($);
    } else if (sourceUrl.includes('yellowpages.com')) {
      return this.extractYellowPagesData($);
    }
    // ... other source-specific extractors

    return null;
  }

  private extractYelpData($: any): Business | null {
    try {
      return {
        id: crypto.randomUUID(),
        name: $('.business-name').text().trim(),
        phone: $('.phone-number').text().trim(),
        address: $('.address').text().trim(),
        city: $('.city').text().trim(),
        state: $('.state').text().trim(),
        zip: $('.zip').text().trim(),
        category: $('.category-str-list').text().split(',').map(s => s.trim()),
        rating: parseFloat($('.rating').text()),
        reviewCount: parseInt($('.review-count').text()),
        services: $('.services-list').text().split(',').map(s => s.trim()),
        hours: this.extractHours($),
        website: $('.website-link').attr('href'),
        verified: false,
        lastUpdated: new Date()
      };
    } catch (error) {
      return null;
    }
  }

  private deduplicateBusinesses(businesses: Business[]): Business[] {
    // Group by phone number and address to identify duplicates
    const uniqueBusinesses = new Map<string, Business>();

    for (const business of businesses) {
      const key = `${business.phone}-${business.address}`.toLowerCase();
      if (!uniqueBusinesses.has(key)) {
        uniqueBusinesses.set(key, business);
      } else {
        // Merge data if we have additional information
        const existing = uniqueBusinesses.get(key)!;
        uniqueBusinesses.set(key, this.mergeBusinessData(existing, business));
      }
    }

    return Array.from(uniqueBusinesses.values());
  }

  private mergeBusinessData(existing: Business, newData: Business): Business {
    return {
      ...existing,
      services: [...new Set([...existing.services, ...newData.services])],
      rating: (existing.rating + newData.rating) / 2,
      reviewCount: existing.reviewCount + newData.reviewCount,
      // Keep the most complete data for other fields
      website: existing.website || newData.website,
      email: existing.email || newData.email,
      hours: existing.hours || newData.hours
    };
  }
}
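The deduplication pass above keys each record on a lowercased `phone-address` pair and merges duplicates as it goes. A self-contained sketch of that merge behaviour (the `Biz` shape, `dedupe` name, and sample records here are illustrative, not from the repo):

```typescript
interface Biz {
  phone: string;
  address: string;
  services: string[];
  website?: string;
}

// Collapse duplicates by a lowercased phone+address key, unioning the
// services arrays and keeping the first non-empty website, mirroring
// deduplicateBusinesses/mergeBusinessData above.
function dedupe(businesses: Biz[]): Biz[] {
  const unique = new Map<string, Biz>();
  for (const b of businesses) {
    const key = `${b.phone}-${b.address}`.toLowerCase();
    const existing = unique.get(key);
    if (!existing) {
      unique.set(key, b);
    } else {
      unique.set(key, {
        ...existing,
        services: [...new Set([...existing.services, ...b.services])],
        website: existing.website || b.website,
      });
    }
  }
  return Array.from(unique.values());
}

const merged = dedupe([
  { phone: '555-0100', address: '1 Main St', services: ['repair'] },
  { phone: '555-0100', address: '1 Main St', services: ['towing'], website: 'https://example.com' },
]);
console.log(merged.length, merged[0].services);
```

Because the key includes the address, the same phone number at two different locations is kept as two records, which matches the comment in the original code.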
@@ -36,6 +36,22 @@ export const loadGeminiChatModels = async () => {
         apiKey: geminiApiKey,
       }),
     },
+    'gemini-2.0-flash-exp': {
+      displayName: 'Gemini 2.0 Flash Exp',
+      model: new ChatGoogleGenerativeAI({
+        modelName: 'gemini-2.0-flash-exp',
+        temperature: 0.7,
+        apiKey: geminiApiKey,
+      }),
+    },
+    'gemini-2.0-flash-thinking-exp-01-21': {
+      displayName: 'Gemini 2.0 Flash Thinking Exp 01-21',
+      model: new ChatGoogleGenerativeAI({
+        modelName: 'gemini-2.0-flash-thinking-exp-01-21',
+        temperature: 0.7,
+        apiKey: geminiApiKey,
+      }),
+    },
   };

   return chatModels;
@@ -4,6 +4,12 @@ import { loadOpenAIChatModels, loadOpenAIEmbeddingsModels } from './openai';
 import { loadAnthropicChatModels } from './anthropic';
 import { loadTransformersEmbeddingsModels } from './transformers';
 import { loadGeminiChatModels, loadGeminiEmbeddingsModels } from './gemini';
+import {
+  getCustomOpenaiApiKey,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiModelName,
+} from '../../config';
+import { ChatOpenAI } from '@langchain/openai';

 const chatModelProviders = {
   openai: loadOpenAIChatModels,
@@ -30,7 +36,27 @@ export const getAvailableChatModelProviders = async () => {
     }
   }

-  models['custom_openai'] = {};
+  const customOpenAiApiKey = getCustomOpenaiApiKey();
+  const customOpenAiApiUrl = getCustomOpenaiApiUrl();
+  const customOpenAiModelName = getCustomOpenaiModelName();
+
+  models['custom_openai'] = {
+    ...(customOpenAiApiKey && customOpenAiApiUrl && customOpenAiModelName
+      ? {
+          [customOpenAiModelName]: {
+            displayName: customOpenAiModelName,
+            model: new ChatOpenAI({
+              openAIApiKey: customOpenAiApiKey,
+              modelName: customOpenAiModelName,
+              temperature: 0.7,
+              configuration: {
+                baseURL: customOpenAiApiUrl,
+              },
+            }),
+          },
+        }
+      : {}),
+  };

   return models;
 };
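The hunk above relies on conditional object spread to register the custom provider only when the API key, base URL, and model name are all configured. The pattern in isolation (the `makeModels` helper and its arguments are illustrative, not repo code):

```typescript
// Build a model map that contains the custom entry only when every
// required setting is truthy, using `...(cond ? { ... } : {})`.
const makeModels = (apiKey?: string, apiUrl?: string, name?: string) => ({
  ...(apiKey && apiUrl && name
    ? { [name]: { displayName: name } }
    : {}),
});

console.log(Object.keys(makeModels('k', 'https://api.example.com', 'my-model'))); // keys: ['my-model']
console.log(Object.keys(makeModels('k', undefined, 'my-model'))); // keys: []
```

Spreading an empty object is a no-op, so a partially configured custom provider simply produces an empty `custom_openai` section instead of a broken entry.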
@@ -1,54 +0,0 @@
import axios from 'axios';
import { config } from '../config';

interface SearchOptions {
  maxResults?: number;
  type?: 'general' | 'news';
  engines?: string[];
}

interface SearchResult {
  url: string;
  title: string;
  content: string;
  score?: number;
}

export async function searchWeb(
  query: string,
  options: SearchOptions = {}
): Promise<SearchResult[]> {
  const {
    maxResults = 20,
    type = 'general',
    engines = ['google', 'bing', 'duckduckgo']
  } = options;

  try {
    const response = await axios.get(`${config.search.searxngUrl || process.env.SEARXNG_URL}/search`, {
      params: {
        q: query,
        format: 'json',
        categories: type,
        engines: engines.join(','),
        limit: maxResults
      }
    });

    if (!response.data || !response.data.results) {
      console.error('Invalid response from SearxNG:', response.data);
      return [];
    }

    return response.data.results.map((result: any) => ({
      url: result.url,
      title: result.title,
      content: result.content || result.snippet || '',
      score: result.score
    }));
  } catch (error) {
    console.error('Search failed:', error);
    throw error;
  }
}
src/lib/searchEngines/bing.ts (Normal file, 105 lines)
@@ -0,0 +1,105 @@
import axios from 'axios';
import { getBingSubscriptionKey } from '../../config';

interface BingAPISearchResult {
  _type: string;
  name: string;
  url: string;
  displayUrl: string;
  snippet?: string;
  dateLastCrawled?: string;
  thumbnailUrl?: string;
  contentUrl?: string;
  hostPageUrl?: string;
  width?: number;
  height?: number;
  accentColor?: string;
  contentSize?: string;
  datePublished?: string;
  encodingFormat?: string;
  hostPageDisplayUrl?: string;
  id?: string;
  isLicensed?: boolean;
  isFamilyFriendly?: boolean;
  language?: string;
  mediaUrl?: string;
  motionThumbnailUrl?: string;
  publisher?: string;
  viewCount?: number;
  webSearchUrl?: string;
  primaryImageOfPage?: {
    thumbnailUrl?: string;
    width?: number;
    height?: number;
  };
  video?: {
    allowHttpsEmbed?: boolean;
    embedHtml?: string;
    allowMobileEmbed?: boolean;
    viewCount?: number;
    duration?: string;
  };
  image?: {
    thumbnail?: {
      contentUrl?: string;
      width?: number;
      height?: number;
    };
    imageInsightsToken?: string;
    imageId?: string;
  };
}

export const searchBingAPI = async (query: string) => {
  try {
    const bingApiKey = await getBingSubscriptionKey();
    const url = new URL(`https://api.cognitive.microsoft.com/bing/v7.0/search`);
    url.searchParams.append('q', query);
    url.searchParams.append('responseFilter', 'Webpages,Images,Videos');

    const res = await axios.get(url.toString(), {
      headers: {
        'Ocp-Apim-Subscription-Key': bingApiKey,
        Accept: 'application/json',
      },
    });

    if (res.data.error) {
      throw new Error(`Bing API Error: ${res.data.error.message}`);
    }

    const originalres = res.data;

    // Extract web, image, and video results
    const webResults = originalres.webPages?.value || [];
    const imageResults = originalres.images?.value || [];
    const videoResults = originalres.videos?.value || [];

    const results = webResults.map((item: BingAPISearchResult) => ({
      title: item.name,
      url: item.url,
      content: item.snippet,
      img_src:
        item.primaryImageOfPage?.thumbnailUrl ||
        imageResults.find((img: any) => img.hostPageUrl === item.url)
          ?.thumbnailUrl ||
        videoResults.find((vid: any) => vid.hostPageUrl === item.url)
          ?.thumbnailUrl,
      ...(item.video && {
        videoData: {
          duration: item.video.duration,
          embedUrl: item.video.embedHtml?.match(/src="(.*?)"/)?.[1],
        },
        publisher: item.publisher,
        datePublished: item.datePublished,
      }),
    }));

    return { results, originalres };
  } catch (error) {
    const errorMessage = error.response?.data
      ? JSON.stringify(error.response.data, null, 2)
      : error.message || 'Unknown error';
    throw new Error(`Bing API Error: ${errorMessage}`);
  }
};
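Each engine wrapper in this changeset normalizes axios failures the same way: prefer the structured response body when the server returned one, otherwise fall back to the error message. That formatting step on its own (the `formatApiError` name and the sample error shapes are illustrative):

```typescript
// Turn either an axios-style error (carrying response.data) or a plain
// Error into a single printable message, as the catch blocks above do.
function formatApiError(error: any): string {
  return error.response?.data
    ? JSON.stringify(error.response.data, null, 2)
    : error.message || 'Unknown error';
}

console.log(formatApiError(new Error('timeout')));
console.log(formatApiError({ response: { data: { code: 401 } } }));
```

Pretty-printing `response.data` keeps provider-specific error payloads (quota, auth, malformed query) readable in logs instead of collapsing them to `[object Object]`.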
src/lib/searchEngines/brave.ts (Normal file, 102 lines)
@@ -0,0 +1,102 @@
import axios from 'axios';
import { getBraveApiKey } from '../../config';

interface BraveSearchResult {
  title: string;
  url: string;
  content?: string;
  img_src?: string;
  age?: string;
  family_friendly?: boolean;
  language?: string;
  video?: {
    embedUrl?: string;
    duration?: string;
  };
  rating?: {
    value: number;
    scale: number;
  };
  products?: Array<{
    name: string;
    price?: string;
  }>;
  recipe?: {
    ingredients?: string[];
    cookTime?: string;
  };
  meta?: {
    fetched?: string;
    lastCrawled?: string;
  };
}

export const searchBraveAPI = async (
  query: string,
  numResults: number = 20,
): Promise<{ results: BraveSearchResult[]; originalres: any }> => {
  try {
    const braveApiKey = await getBraveApiKey();
    const url = new URL(`https://api.search.brave.com/res/v1/web/search`);

    url.searchParams.append('q', query);
    url.searchParams.append('count', numResults.toString());

    const res = await axios.get(url.toString(), {
      headers: {
        'X-Subscription-Token': braveApiKey,
        Accept: 'application/json',
      },
    });

    if (res.data.error) {
      throw new Error(`Brave API Error: ${res.data.error.message}`);
    }

    const originalres = res.data;
    const webResults = originalres.web?.results || [];

    const results: BraveSearchResult[] = webResults.map((item: any) => ({
      title: item.title,
      url: item.url,
      content: item.description,
      img_src: item.thumbnail?.src || item.deep_results?.images?.[0]?.src,
      age: item.age,
      family_friendly: item.family_friendly,
      language: item.language,
      video: item.video
        ? {
            embedUrl: item.video.embed_url,
            duration: item.video.duration,
          }
        : undefined,
      rating: item.rating
        ? {
            value: item.rating.value,
            scale: item.rating.scale_max,
          }
        : undefined,
      products: item.deep_results?.product_cluster?.map((p: any) => ({
        name: p.name,
        price: p.price,
      })),
      recipe: item.recipe
        ? {
            ingredients: item.recipe.ingredients,
            cookTime: item.recipe.cook_time,
          }
        : undefined,
      meta: {
        fetched: item.meta?.fetched,
        lastCrawled: item.meta?.last_crawled,
      },
    }));

    return { results, originalres };
  } catch (error) {
    const errorMessage = error.response?.data
      ? JSON.stringify(error.response.data, null, 2)
      : error.message || 'Unknown error';
    throw new Error(`Brave API Error: ${errorMessage}`);
  }
};
src/lib/searchEngines/google_pse.ts (Normal file, 85 lines)
@@ -0,0 +1,85 @@
import axios from 'axios';
import { getGoogleApiKey, getGoogleCseId } from '../../config';

interface GooglePSESearchResult {
  kind: string;
  title: string;
  htmlTitle: string;
  link: string;
  displayLink: string;
  snippet?: string;
  htmlSnippet?: string;
  cacheId?: string;
  formattedUrl: string;
  htmlFormattedUrl: string;
  pagemap?: {
    videoobject: any;
    cse_thumbnail?: Array<{
      src: string;
      width: string;
      height: string;
    }>;
    metatags?: Array<{
      [key: string]: string;
      author?: string;
    }>;
    cse_image?: Array<{
      src: string;
    }>;
  };
  fileFormat?: string;
  image?: {
    contextLink: string;
    thumbnailLink: string;
  };
  mime?: string;
  labels?: Array<{
    name: string;
    displayName: string;
  }>;
}

export const searchGooglePSE = async (query: string) => {
  try {
    const [googleApiKey, googleCseID] = await Promise.all([
      getGoogleApiKey(),
      getGoogleCseId(),
    ]);

    const url = new URL(`https://www.googleapis.com/customsearch/v1`);
    url.searchParams.append('q', query);
    url.searchParams.append('cx', googleCseID);
    url.searchParams.append('key', googleApiKey);

    const res = await axios.get(url.toString());

    if (res.data.error) {
      throw new Error(`Google PSE Error: ${res.data.error.message}`);
    }

    const originalres = res.data.items;

    const results = originalres.map((item: GooglePSESearchResult) => ({
      title: item.title,
      url: item.link,
      content: item.snippet,
      img_src:
        item.pagemap?.cse_image?.[0]?.src ||
        item.pagemap?.cse_thumbnail?.[0]?.src ||
        item.image?.thumbnailLink,
      ...(item.pagemap?.videoobject?.[0] && {
        videoData: {
          duration: item.pagemap.videoobject[0].duration,
          embedUrl: item.pagemap.videoobject[0].embedurl,
        },
      }),
    }));

    return { results, originalres };
  } catch (error) {
    const errorMessage = error.response?.data
      ? JSON.stringify(error.response.data, null, 2)
      : error.message || 'Unknown error';
    throw new Error(`Google PSE Error: ${errorMessage}`);
  }
};
src/lib/searchEngines/searxng.ts (Normal file, 47 lines)
@@ -0,0 +1,47 @@
import axios from 'axios';
import { getSearxngApiEndpoint } from '../../config';

interface SearxngSearchOptions {
  categories?: string[];
  engines?: string[];
  language?: string;
  pageno?: number;
}

interface SearxngSearchResult {
  title: string;
  url: string;
  img_src?: string;
  thumbnail_src?: string;
  thumbnail?: string;
  content?: string;
  author?: string;
  iframe_src?: string;
}

export const searchSearxng = async (
  query: string,
  opts?: SearxngSearchOptions,
) => {
  const searxngURL = getSearxngApiEndpoint();

  const url = new URL(`${searxngURL}/search?format=json`);
  url.searchParams.append('q', query);

  if (opts) {
    Object.keys(opts).forEach((key) => {
      if (Array.isArray(opts[key])) {
        url.searchParams.append(key, opts[key].join(','));
        return;
      }
      url.searchParams.append(key, opts[key]);
    });
  }

  const res = await axios.get(url.toString());

  const results: SearxngSearchResult[] = res.data.results;
  const suggestions: string[] = res.data.suggestions;

  return { results, suggestions };
};
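The option handling in `searchSearxng` above joins array values with commas and appends scalars as-is. A standalone sketch of that serialization step (the `Opts` shape, `buildSearchUrl` name, and localhost endpoint are illustrative):

```typescript
interface Opts {
  engines?: string[];
  language?: string;
  pageno?: number;
}

// Mirror the loop above: array options become one comma-joined query
// parameter, scalar options are stringified and appended directly.
function buildSearchUrl(base: string, query: string, opts: Opts = {}): string {
  const url = new URL(`${base}/search?format=json`);
  url.searchParams.append('q', query);
  for (const [key, value] of Object.entries(opts)) {
    url.searchParams.append(key, Array.isArray(value) ? value.join(',') : String(value));
  }
  return url.toString();
}

const u = buildSearchUrl('http://localhost:8080', 'perplexica', {
  engines: ['google', 'bing'],
  pageno: 2,
});
console.log(u);
```

Note that `URLSearchParams` percent-encodes the comma, so the serialized query carries `engines=google%2Cbing`; SearXNG decodes this back into a comma-separated engine list.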
src/lib/searchEngines/yacy.ts (Normal file, 79 lines)
@@ -0,0 +1,79 @@
import axios from 'axios';
import { getYacyJsonEndpoint } from '../../config';

interface YaCySearchResult {
  channels: {
    title: string;
    description: string;
    link: string;
    image: {
      url: string;
      title: string;
      link: string;
    };
    startIndex: string;
    itemsPerPage: string;
    searchTerms: string;
    items: {
      title: string;
      link: string;
      code: string;
      description: string;
      pubDate: string;
      image?: string;
      size: string;
      sizename: string;
      guid: string;
      faviconUrl: string;
      host: string;
      path: string;
      file: string;
      urlhash: string;
      ranking: string;
    }[];
    navigation: {
      facetname: string;
      displayname: string;
      type: string;
      min: string;
      max: string;
      mean: string;
      elements: {
        name: string;
        count: string;
        modifier: string;
        url: string;
      }[];
    }[];
  }[];
}

export const searchYaCy = async (query: string, numResults: number = 20) => {
  try {
    const yacyBaseUrl = getYacyJsonEndpoint();

    const url = new URL(`${yacyBaseUrl}/yacysearch.json`);
    url.searchParams.append('query', query);
    url.searchParams.append('count', numResults.toString());

    const res = await axios.get(url.toString());

    const originalres = res.data as YaCySearchResult;

    const results = originalres.channels[0].items.map((item) => ({
      title: item.title,
      url: item.link,
      content: item.description,
      img_src: item.image || null,
      pubDate: item.pubDate,
      host: item.host,
    }));

    return { results, originalres };
  } catch (error) {
    const errorMessage = error.response?.data
      ? JSON.stringify(error.response.data, null, 2)
      : error.message || 'Unknown error';
    throw new Error(`YaCy Error: ${errorMessage}`);
  }
};
@@ -1,313 +0,0 @@
import axios from 'axios';
import * as cheerio from 'cheerio';
import { createWorker } from 'tesseract.js';
import { env } from '../config/env';
import { OllamaService } from './services/ollamaService';
import { BusinessData } from './types';
import { db } from './services/databaseService';
import { generateBusinessId } from './utils';
import { extractContactFromHtml, extractCleanAddress } from './utils/scraper';
import { GeocodingService } from './services/geocodingService';
import { cleanAddress, formatPhoneNumber, cleanEmail, cleanDescription } from './utils/dataCleanup';
import { CleanupService } from './services/cleanupService';

// Define interfaces used only in this file
interface SearchResult {
  url: string;
  title: string;
  content: string;
  phone?: string;
  email?: string;
  address?: string;
  website?: string;
  rating?: number;
  coordinates?: {
    lat: number;
    lng: number;
  };
}

interface ContactInfo {
  phone?: string;
  email?: string;
  address?: string;
  description?: string;
  openingHours?: string[];
}

// Export the main search function
export async function searchBusinesses(
  query: string,
  options: { onProgress?: (status: string, progress: number) => void } = {}
): Promise<BusinessData[]> {
  try {
    console.log('Processing search query:', query);
    const [searchTerm, location] = query.split(' in ').map(s => s.trim());
    if (!searchTerm || !location) {
      throw new Error('Invalid search query format. Use: "search term in location"');
    }

    options.onProgress?.('Checking cache', 0);

    // Check cache first
    const cacheKey = `search:${searchTerm}:${location}`;
    let results = await db.getFromCache(cacheKey);

    if (!results) {
      // Check database for existing businesses
      console.log('Searching database for:', searchTerm, 'in', location);
      const existingBusinesses = await db.searchBusinesses(searchTerm, location);

      // Start search immediately
      console.log('Starting web search');
      const searchPromise = performSearch(searchTerm, location, options);

      if (existingBusinesses.length > 0) {
        console.log(`Found ${existingBusinesses.length} existing businesses`);
        options.onProgress?.('Retrieved from database', 50);
      }

      // Wait for new results
      const newResults = await searchPromise;
      console.log(`Got ${newResults.length} new results from search`);

      // Merge results, removing duplicates by ID
      const allResults = [...existingBusinesses];
      for (const result of newResults) {
        if (!allResults.some(b => b.id === result.id)) {
          allResults.push(result);
        }
      }

      console.log(`Total unique results: ${allResults.length}`);

      // Cache combined results
      await db.saveToCache(cacheKey, allResults, env.cache.durationHours * 60 * 60 * 1000);

      console.log(`Returning ${allResults.length} total results (${existingBusinesses.length} existing + ${newResults.length} new)`);
      results = allResults;
    }

    // Clean all results using LLM
    options.onProgress?.('Cleaning data', 75);
    const cleanedResults = await CleanupService.cleanBusinessRecords(results);

    options.onProgress?.('Search complete', 100);
    return cleanedResults;
  } catch (error) {
    console.error('Search error:', error);
    return [];
  }
}

async function performSearch(
  searchTerm: string,
  location: string,
  options: any
): Promise<BusinessData[]> {
  const queries = [
    searchTerm + ' ' + location,
    searchTerm + ' business near ' + location,
    searchTerm + ' services ' + location,
    'local ' + searchTerm + ' ' + location
  ];

  options.onProgress?.('Searching multiple sources', 25);

  let allResults: SearchResult[] = [];
  const seenUrls = new Set<string>();

  for (const q of queries) {
    try {
      const response = await axios.get(`${env.searxng.currentUrl}/search`, {
        params: {
          q,
          format: 'json',
          engines: 'google,google_maps',
          language: 'en-US',
          time_range: '',
          safesearch: 1
        }
      });

      if (response.data?.results) {
        // Deduplicate results
        const newResults = response.data.results.filter((result: SearchResult) => {
          if (seenUrls.has(result.url)) {
            return false;
          }
          seenUrls.add(result.url);
          return true;
        });

        console.log(`Found ${newResults.length} unique results from ${response.data.results[0]?.engine}`);
        allResults = allResults.concat(newResults);
      }
    } catch (error) {
      console.error(`Search failed for query "${q}":`, error);
    }
  }

  options.onProgress?.('Processing results', 50);

  const filteredResults = allResults.filter(isValidBusinessResult);
  const processedResults = await processResults(filteredResults, location);

  // Save results to database
  for (const result of processedResults) {
    await db.saveBusiness(result).catch(console.error);
  }

  options.onProgress?.('Search complete', 100);
  return processedResults;
}

// Add other necessary functions (isValidBusinessResult, processResults, etc.)
function isValidBusinessResult(result: SearchResult): boolean {
  // Skip listing/directory pages and search results
  const skipPatterns = [
    'tripadvisor.com',
    'yelp.com',
    'opentable.com',
    'restaurants-for-sale',
    'guide.michelin.com',
    'denver.org',
    '/blog/',
    '/maps/',
    'search?',
    'features/',
    '/lists/',
    'reddit.com',
    'eater.com'
  ];

  if (skipPatterns.some(pattern => result.url.toLowerCase().includes(pattern))) {
    console.log(`Skipping listing page: ${result.url}`);
    return false;
  }

  // Must have a title
  if (!result.title || result.title.length < 2) {
    return false;
  }

  // Skip results that look like articles or lists
  const articlePatterns = [
    'Best',
    'Top',
    'Guide',
    'Where to',
    'Welcome to',
    'Updated',
    'Near',
    'Restaurants in'
  ];

  if (articlePatterns.some(pattern => result.title.includes(pattern))) {
    console.log(`Skipping article: ${result.title}`);
    return false;
  }

  // Only accept results that look like actual business pages
  const businessPatterns = [
    'menu',
    'reservation',
    'location',
    'contact',
    'about-us',
    'home'
  ];

  const hasBusinessPattern = businessPatterns.some(pattern =>
    result.url.toLowerCase().includes(pattern) ||
    result.content.toLowerCase().includes(pattern)
  );

  if (!hasBusinessPattern) {
    console.log(`Skipping non-business page: ${result.url}`);
    return false;
  }

  return true;
}

async function processResults(results: SearchResult[], location: string): Promise<BusinessData[]> {
  const processedResults: BusinessData[] = [];

  // Get coordinates for the location
  const locationGeo = await GeocodingService.geocode(location);
  const defaultCoords = locationGeo || { lat: 39.7392, lng: -104.9903 };

  for (const result of results) {
    try {
      // Extract contact info from webpage
      const contactInfo = await extractContactFromHtml(result.url);

      // Create initial business record
      const business: BusinessData = {
        id: generateBusinessId(result),
        name: cleanBusinessName(result.title),
        phone: result.phone || contactInfo.phone || '',
        email: result.email || contactInfo.email || '',
        address: result.address || contactInfo.address || '',
        rating: result.rating || 0,
        website: result.website || result.url || '',
        logo: '',
        source: 'web',
        description: result.content || contactInfo.description || '',
        location: defaultCoords,
        openingHours: contactInfo.openingHours
      };

      // Clean up the record using LLM
      const cleanedBusiness = await CleanupService.cleanBusinessRecord(business);

      // Get coordinates for cleaned address
      if (cleanedBusiness.address) {
        const addressGeo = await GeocodingService.geocode(cleanedBusiness.address);
        if (addressGeo) {
          cleanedBusiness.location = addressGeo;
        }
      }

      // Only add if we have at least a name and either phone or address
      if (cleanedBusiness.name && (cleanedBusiness.phone || cleanedBusiness.address)) {
        processedResults.push(cleanedBusiness);
      }
    } catch (error) {
      console.error(`Error processing result ${result.title}:`, error);
    }
  }

  return processedResults;
}

// Helper functions
function cleanBusinessName(name: string): string {
  // Remove common suffixes and prefixes
  const cleanName = name
    .replace(/^(The|A|An)\s+/i, '')
    .replace(/\s+(-|–|—|:).*$/, '')
    .replace(/\s*\([^)]*\)/g, '')
    .trim();

  return cleanName;
}

async function getLocationCoordinates(address: string): Promise<{lat: number, lng: number}> {
  // Implement geocoding here
  // For now, return default coordinates for Denver
  return { lat: 39.7392, lng: -104.9903 };
}

async function searchAndUpdateInBackground(searchTerm: string, location: string) {
  try {
    const results = await performSearch(searchTerm, location, {});
    console.log(`Updated ${results.length} businesses in background`);
  } catch (error) {
    console.error('Background search error:', error);
  }
}

// ... rest of the file remains the same
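The removed `searchBusinesses` entry point split its query on `' in '` and rejected anything that did not yield both halves. That parsing step in isolation (the `parseQuery` name and sample query are illustrative, not from the repo):

```typescript
// Split a '"search term" in "location"' query; both halves are required,
// matching the validation in the removed searchBusinesses function.
function parseQuery(query: string): { term: string; location: string } {
  const [term, location] = query.split(' in ').map((s) => s.trim());
  if (!term || !location) {
    throw new Error('Invalid search query format. Use: "search term in location"');
  }
  return { term, location };
}

const parsed = parseQuery('plumbers in Denver');
console.log(parsed.term, parsed.location);
```

One consequence of the naive split is that a query containing ` in ` twice only keeps the first two segments, which matches the destructuring in the original code.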
@@ -1,111 +0,0 @@
import axios from 'axios';
import * as cheerio from 'cheerio';
import { Cache } from '../utils/cache';
import { RateLimiter } from '../utils/rateLimiter';

interface CrawlResult {
  mainContent: string;
  contactInfo: string;
  aboutInfo: string;
  structuredData: any;
}

export class BusinessCrawler {
  private cache: Cache<CrawlResult>;
  private rateLimiter: RateLimiter;

  constructor() {
    this.cache = new Cache<CrawlResult>(60); // 1 hour cache
    this.rateLimiter = new RateLimiter();
  }

  async crawlBusinessSite(url: string): Promise<CrawlResult> {
    // Check cache first
    const cached = this.cache.get(url);
    if (cached) return cached;

    await this.rateLimiter.waitForSlot();

    try {
      const mainPage = await this.fetchPage(url);
      const $ = cheerio.load(mainPage);

      // Get all important URLs
      const contactUrl = this.findContactPage($, url);
      const aboutUrl = this.findAboutPage($, url);

      // Crawl additional pages
      const [contactPage, aboutPage] = await Promise.all([
        contactUrl ? this.fetchPage(contactUrl) : '',
        aboutUrl ? this.fetchPage(aboutUrl) : ''
      ]);

      // Extract structured data
      const structuredData = this.extractStructuredData($);

      const result = {
        mainContent: $('body').text(),
        contactInfo: contactPage,
        aboutInfo: aboutPage,
        structuredData
      };

      this.cache.set(url, result);
      return result;
    } catch (error) {
      console.error(`Failed to crawl ${url}:`, error);
      return {
        mainContent: '',
        contactInfo: '',
        aboutInfo: '',
        structuredData: {}
      };
    }
  }

  private async fetchPage(url: string): Promise<string> {
    try {
      const response = await axios.get(url, {
        timeout: 10000,
        headers: {
          'User-Agent': 'Mozilla/5.0 (compatible; BizSearch/1.0; +http://localhost:3000/about)',
        }
      });
      return response.data;
    } catch (error) {
      console.error(`Failed to fetch ${url}:`, error);
      return '';
    }
  }

  private findContactPage($: cheerio.CheerioAPI, baseUrl: string): string | null {
    const contactLinks = $('a[href*="contact"], a:contains("Contact")');
    if (contactLinks.length > 0) {
      const href = contactLinks.first().attr('href');
      return href ? new URL(href, baseUrl).toString() : null;
    }
    return null;
  }

  private findAboutPage($: cheerio.CheerioAPI, baseUrl: string): string | null {
    const aboutLinks = $('a[href*="about"], a:contains("About")');
    if (aboutLinks.length > 0) {
      const href = aboutLinks.first().attr('href');
      return href ? new URL(href, baseUrl).toString() : null;
    }
    return null;
  }

  private extractStructuredData($: cheerio.CheerioAPI): any {
    const structuredData: any[] = [];
    $('script[type="application/ld+json"]').each((_, element) => {
      try {
        const data = JSON.parse($(element).html() || '{}');
        structuredData.push(data);
      } catch (error) {
        console.error('Failed to parse structured data:', error);
      }
    });
    return structuredData;
  }
}
@@ -1,71 +0,0 @@
import { supabase } from '../supabase';
import { BusinessData } from '../searxng';

export class CacheService {
  static async getCachedResults(category: string, location: string): Promise<BusinessData[] | null> {
    try {
      const { data, error } = await supabase
        .from('search_cache')
        .select('results')
        .eq('category', category.toLowerCase())
        .eq('location', location.toLowerCase())
        .gt('expires_at', new Date().toISOString())
        .order('created_at', { ascending: false })
        .limit(1)
        .single();

      if (error) throw error;
      return data ? data.results : null;
    } catch (error) {
      console.error('Cache lookup failed:', error);
      return null;
    }
  }

  static async cacheResults(
    category: string,
    location: string,
    results: BusinessData[],
    expiresInDays: number = 7
  ): Promise<void> {
    try {
      const expiresAt = new Date();
      expiresAt.setDate(expiresAt.getDate() + expiresInDays);

      const { error } = await supabase
        .from('search_cache')
        .insert({
          query: `${category} in ${location}`,
          category: category.toLowerCase(),
          location: location.toLowerCase(),
          results,
          expires_at: expiresAt.toISOString()
        });

      if (error) throw error;
    } catch (error) {
      console.error('Failed to cache results:', error);
    }
  }

  static async updateCache(
    category: string,
    location: string,
    newResults: BusinessData[]
  ): Promise<void> {
    try {
      const { error } = await supabase
        .from('search_cache')
        .update({
          results: newResults,
          updated_at: new Date().toISOString()
        })
        .eq('category', category.toLowerCase())
        .eq('location', location.toLowerCase());

      if (error) throw error;
    } catch (error) {
      console.error('Failed to update cache:', error);
    }
  }
}
@@ -1,235 +0,0 @@
import { DeepSeekService } from './deepseekService';
import { Business } from '../types';
import { db } from './databaseService';

// Constants for validation and scoring
const BATCH_SIZE = 3; // Process businesses in small batches to avoid overwhelming LLM
const LLM_TIMEOUT = 30000; // 30 second timeout for LLM requests
const MIN_CONFIDENCE_SCORE = 0.7; // Minimum score required to cache results
const VALID_EMAIL_REGEX = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const VALID_PHONE_REGEX = /^\(\d{3}\) \d{3}-\d{4}$/;
const VALID_ADDRESS_REGEX = /^\d+.*(?:street|st|avenue|ave|road|rd|boulevard|blvd|lane|ln|drive|dr|court|ct|circle|cir|way|parkway|pkwy|place|pl),?\s+[a-z ]+,\s*[a-z]{2}\s+\d{5}$/i;

export class CleanupService {
  /**
   * Attempts to clean business data using LLM with timeout protection.
   * Falls back to original data if LLM fails or times out.
   */
  private static async cleanWithLLM(prompt: string, originalBusiness: Business): Promise<string> {
    try {
      const timeoutPromise = new Promise((_, reject) => {
        setTimeout(() => reject(new Error('LLM timeout')), LLM_TIMEOUT);
      });

      const llmPromise = DeepSeekService.chat([{
        role: 'user',
        content: prompt
      }]);

      const response = await Promise.race([llmPromise, timeoutPromise]);
      return (response as string).trim();
    } catch (error) {
      console.error('LLM cleanup error:', error);
      // On timeout, return the original values
      return `
        Address: ${originalBusiness.address}
        Phone: ${originalBusiness.phone}
        Email: ${originalBusiness.email}
        Description: ${originalBusiness.description}
      `;
    }
  }

  /**
   * Calculates a confidence score (0-1) for the cleaned business data.
   * Score is based on:
   * - Valid email format (0.25)
   * - Valid phone format (0.25)
   * - Valid address format (0.25)
   * - Description quality (0.25)
   */
  private static calculateConfidenceScore(business: Business): number {
    let score = 0;

    // Valid email adds 0.25
    if (business.email && VALID_EMAIL_REGEX.test(business.email)) {
      score += 0.25;
    }

    // Valid phone adds 0.25
    if (business.phone && VALID_PHONE_REGEX.test(business.phone)) {
      score += 0.25;
    }

    // Valid address adds 0.25
    if (business.address && VALID_ADDRESS_REGEX.test(business.address)) {
      score += 0.25;
    }

    // Description quality checks (0.25 max)
    if (business.description) {
      // Length check (0.1)
      if (business.description.length > 30 && business.description.length < 200) {
        score += 0.1;
      }

      // Relevance check (0.1)
      const businessType = this.getBusinessType(business.name);
      if (business.description.toLowerCase().includes(businessType)) {
        score += 0.1;
      }

      // No HTML/markdown (0.05)
      if (!/[<>[\]()]/.test(business.description)) {
        score += 0.05;
      }
    }

    return score;
  }

  /**
   * Determines the type of business based on name keywords.
   * Used for validating and generating descriptions.
   */
  private static getBusinessType(name: string): string {
    const types = [
      'restaurant', 'plumber', 'electrician', 'cafe', 'bar',
      'salon', 'shop', 'store', 'service'
    ];

    const nameLower = name.toLowerCase();
    return types.find(type => nameLower.includes(type)) || 'business';
  }

  /**
   * Parses LLM response into structured business data.
   * Expects format: "field: value" for each line.
   */
  private static parseResponse(response: string): Partial<Business> {
    const cleaned: Partial<Business> = {};
    const lines = response.split('\n');

    for (const line of lines) {
      const [field, ...values] = line.split(':');
      const value = values.join(':').trim();

      switch (field.toLowerCase().trim()) {
        case 'address':
          cleaned.address = value;
          break;
        case 'phone':
          cleaned.phone = value;
          break;
        case 'email':
          cleaned.email = value;
          break;
        case 'description':
          cleaned.description = value;
          break;
      }
    }

    return cleaned;
  }

  /**
   * Applies validation rules and cleaning to each field.
   * - Standardizes formats
   * - Removes invalid data
   * - Ensures consistent formatting
   */
  private static validateAndClean(business: Business): Business {
    const cleaned = { ...business };

    // Email validation and cleaning
    if (cleaned.email) {
      cleaned.email = cleaned.email
        .toLowerCase()
        .replace(/\[|\]|\(mailto:.*?\)/g, '')
        .replace(/^\d+-\d+/, '')
        .trim();

      if (!VALID_EMAIL_REGEX.test(cleaned.email) ||
          ['none', 'n/a', 'union office', ''].includes(cleaned.email.toLowerCase())) {
        cleaned.email = '';
      }
    }

    // Phone validation and cleaning
    if (cleaned.phone) {
      const digits = cleaned.phone.replace(/\D/g, '');
      if (digits.length === 10) {
        cleaned.phone = `(${digits.slice(0,3)}) ${digits.slice(3,6)}-${digits.slice(6)}`;
      } else {
        cleaned.phone = '';
      }
    }

    // Address validation and cleaning
    if (cleaned.address) {
      cleaned.address = cleaned.address
        .replace(/^.*?(?=\d|[A-Z])/s, '')
        .replace(/^(Sure!.*?:|The business.*?:|.*?address.*?:)(?:\s*\\n)*\s*/si, '')
        .replace(/\s+/g, ' ')
        .trim();

      // Standardize state abbreviations
      cleaned.address = cleaned.address.replace(/\b(Colorado|Colo|Col)\b/gi, 'CO');
    }

    // Description validation and cleaning
    if (cleaned.description) {
      cleaned.description = cleaned.description
        .replace(/\$\d+(\.\d{2})?/g, '') // Remove prices
        .replace(/\b(call|email|website|click|visit)\b.*$/i, '') // Remove calls to action
        .replace(/\s+/g, ' ')
        .trim();

      const businessType = this.getBusinessType(cleaned.name);
      if (businessType !== 'business' &&
          !cleaned.description.toLowerCase().includes(businessType)) {
        cleaned.description = `${businessType.charAt(0).toUpperCase() + businessType.slice(1)} services in the Denver area.`;
      }
    }

    return cleaned;
  }

  static async cleanBusinessRecord(business: Business): Promise<Business> {
    // Check cache first
    const cacheKey = `clean:${business.id}`;
    const cached = await db.getFromCache(cacheKey);
    if (cached) {
      console.log('Using cached clean data for:', business.name);
      return cached;
    }

    // Clean using DeepSeek
    const cleaned = await DeepSeekService.cleanBusinessData(business);
    const validated = this.validateAndClean({ ...business, ...cleaned });

    // Only cache if confidence score is high enough
    const confidence = this.calculateConfidenceScore(validated);
    if (confidence >= MIN_CONFIDENCE_SCORE) {
      await db.saveToCache(cacheKey, validated, 24 * 60 * 60 * 1000);
    }

    return validated;
  }

  static async cleanBusinessRecords(businesses: Business[]): Promise<Business[]> {
    const cleanedBusinesses: Business[] = [];

    // Process in batches
    for (let i = 0; i < businesses.length; i += BATCH_SIZE) {
      const batch = businesses.slice(i, i + BATCH_SIZE);
      const cleanedBatch = await Promise.all(
        batch.map(business => this.cleanBusinessRecord(business))
      );
      cleanedBusinesses.push(...cleanedBatch);
    }

    return cleanedBusinesses;
  }
}
@@ -1,107 +0,0 @@
import { OllamaService } from './ollamaService';

interface ValidatedBusinessData {
  name: string;
  phone: string;
  email: string;
  address: string;
  description: string;
  hours?: string;
  isValid: boolean;
}

export class DataValidationService {
  private ollama: OllamaService;

  constructor() {
    this.ollama = new OllamaService();
  }

  async validateAndCleanData(rawText: string): Promise<ValidatedBusinessData> {
    try {
      const prompt = `
You are a business data validation expert. Extract and validate business information from the following text.
Return ONLY a JSON object with the following format, nothing else:
{
  "name": "verified business name",
  "phone": "formatted phone number or N/A",
  "email": "verified email address or N/A",
  "address": "verified physical address or N/A",
  "description": "short business description",
  "hours": "business hours if available",
  "isValid": boolean
}

Rules:
1. Phone numbers should be in (XXX) XXX-XXXX format
2. Addresses should be properly formatted with street, city, state, zip
3. Remove any irrelevant text from descriptions
4. Set isValid to true only if name and at least one contact method is found
5. Clean up any obvious formatting issues
6. Validate email addresses for proper format

Text to analyze:
${rawText}
`;

      const response = await this.ollama.generateResponse(prompt);

      try {
        // Find the JSON object in the response
        const jsonMatch = response.match(/\{[\s\S]*\}/);
        if (!jsonMatch) {
          throw new Error('No JSON found in response');
        }

        const result = JSON.parse(jsonMatch[0]);
        return this.validateResult(result);
      } catch (parseError) {
        console.error('Failed to parse Ollama response:', parseError);
        throw parseError;
      }
    } catch (error) {
      console.error('Data validation failed:', error);
      return {
        name: 'Unknown',
        phone: 'N/A',
        email: 'N/A',
        address: 'N/A',
        description: '',
        hours: '',
        isValid: false
      };
    }
  }

  private validateResult(result: any): ValidatedBusinessData {
    // Ensure all required fields are present
    const validated: ValidatedBusinessData = {
      name: this.cleanField(result.name) || 'Unknown',
      phone: this.formatPhone(result.phone) || 'N/A',
      email: this.cleanField(result.email) || 'N/A',
      address: this.cleanField(result.address) || 'N/A',
      description: this.cleanField(result.description) || '',
      hours: this.cleanField(result.hours),
      isValid: Boolean(result.isValid)
    };

    return validated;
  }

  private cleanField(value: any): string {
    if (!value || typeof value !== 'string') return '';
    return value.trim().replace(/\s+/g, ' ');
  }

  private formatPhone(phone: string): string {
    if (!phone || phone === 'N/A') return 'N/A';

    // Extract digits
    const digits = phone.replace(/\D/g, '');
    if (digits.length === 10) {
      return `(${digits.slice(0,3)}) ${digits.slice(3,6)}-${digits.slice(6)}`;
    }

    return phone;
  }
}
@@ -1,80 +0,0 @@
import { createClient } from '@supabase/supabase-js';
import { Business } from '../types';
import env from '../../config/env';

interface PartialBusiness {
  name: string;
  address: string;
  phone: string;
  description: string;
  website?: string;
  rating?: number;
  source?: string;
  location?: {
    lat: number;
    lng: number;
  };
}

export class DatabaseService {
  private supabase;

  constructor() {
    this.supabase = createClient(env.SUPABASE_URL, env.SUPABASE_KEY);
  }

  async saveBusiness(business: PartialBusiness): Promise<Business> {
    const { data, error } = await this.supabase
      .from('businesses')
      .upsert({
        name: business.name,
        address: business.address,
        phone: business.phone,
        description: business.description,
        website: business.website,
        source: business.source || 'deepseek',
        rating: business.rating || 4.5,
        location: business.location ? `(${business.location.lng},${business.location.lat})` : '(0,0)'
      })
      .select()
      .single();

    if (error) {
      console.error('Error saving business:', error);
      throw new Error('Failed to save business');
    }

    return data;
  }

  async findBusinessesByQuery(query: string, location: string): Promise<Business[]> {
    const { data, error } = await this.supabase
      .from('businesses')
      .select('*')
      .or(`name.ilike.%${query}%,description.ilike.%${query}%`)
      .ilike('address', `%${location}%`)
      .order('rating', { ascending: false });

    if (error) {
      console.error('Error finding businesses:', error);
      throw new Error('Failed to find businesses');
    }

    return data || [];
  }

  async getBusinessById(id: string): Promise<Business | null> {
    const { data, error } = await this.supabase
      .from('businesses')
      .select('*')
      .eq('id', id)
      .single();

    if (error) {
      console.error('Error getting business:', error);
      return null;
    }

    return data;
  }
}
@@ -1,285 +0,0 @@
import axios from 'axios';
import EventEmitter from 'events';
import { Business } from '../types';

interface PartialBusiness {
  name: string;
  address: string;
  phone: string;
  description: string;
  website?: string;
  rating?: number;
}

export class DeepSeekService extends EventEmitter {
  private readonly baseUrl: string;
  private readonly model: string;

  constructor() {
    super();
    this.baseUrl = process.env.OLLAMA_URL || 'http://localhost:11434';
    this.model = process.env.OLLAMA_MODEL || 'deepseek-coder:6.7b';
    console.log('DeepSeekService initialized with:', {
      baseUrl: this.baseUrl,
      model: this.model
    });
  }

  async streamChat(messages: any[], onResult: (business: PartialBusiness) => Promise<void>): Promise<void> {
    try {
      console.log('\nStarting streaming chat request...');

      // Enhanced system prompt with more explicit instructions
      const enhancedMessages = [
        {
          role: "system",
          content: `You are a business search assistant powered by Deepseek Coder. Your task is to generate sample business listings in JSON format.

When asked about businesses in a location, return business listings one at a time in this exact JSON format:

\`\`\`json
{
  "name": "Example Plumbing Co",
  "address": "123 Main St, Denver, CO 80202",
  "phone": "(303) 555-0123",
  "description": "Licensed plumbing contractor specializing in residential and commercial services",
  "website": "https://exampleplumbing.com",
  "rating": 4.8
}
\`\`\`

Important rules:
1. Return ONE business at a time in JSON format
2. Generate realistic but fictional business data
3. Use proper formatting for phone numbers and addresses
4. Include ratings from 1-5 stars (can use decimals)
5. When sorting by rating, return highest rated first
6. Make each business unique with different names, addresses, and phone numbers
7. Keep descriptions concise and professional
8. Use realistic website URLs based on business names
9. Return exactly the number of businesses requested`
        },
        ...messages
      ];

      console.log('Sending streaming request to Ollama with messages:', JSON.stringify(enhancedMessages, null, 2));

      const response = await axios.post(`${this.baseUrl}/api/chat`, {
        model: this.model,
        messages: enhancedMessages,
        stream: true,
        temperature: 0.7,
        max_tokens: 1000,
        system: "You are a business search assistant that returns one business at a time in JSON format."
      }, {
        responseType: 'stream'
      });

      let currentJson = '';
      response.data.on('data', async (chunk: Buffer) => {
        const text = chunk.toString();
        currentJson += text;

        // Try to find and process complete JSON objects
        try {
          const business = await this.extractNextBusiness(currentJson);
          if (business) {
            currentJson = ''; // Reset for next business
            await onResult(business);
          }
        } catch (error) {
          // Continue collecting more data if JSON is incomplete
          console.debug('Collecting more data for complete JSON');
        }
      });

      return new Promise((resolve, reject) => {
        response.data.on('end', () => resolve());
        response.data.on('error', (error: Error) => reject(error));
      });

    } catch (error) {
      console.error('\nDeepseek streaming chat error:', error);
      if (error instanceof Error) {
        console.error('Error stack:', error.stack);
        throw new Error(`AI model streaming error: ${error.message}`);
      }
      throw new Error('Failed to get streaming response from AI model');
    }
  }

  private async extractNextBusiness(text: string): Promise<PartialBusiness | null> {
    // Try to find a complete JSON object
    const jsonMatch = text.match(/\{[^{]*\}/);
    if (!jsonMatch) return null;

    try {
      const jsonStr = jsonMatch[0];
      const business = JSON.parse(jsonStr);

      // Validate required fields
      if (!business.name || !business.address || !business.phone || !business.description) {
        return null;
      }

      return business;
    } catch (e) {
      return null;
    }
  }

  async chat(messages: any[]): Promise<any> {
    try {
      console.log('\nStarting chat request...');

      // Enhanced system prompt with more explicit instructions
      const enhancedMessages = [
        {
          role: "system",
          content: `You are a business search assistant powered by Deepseek Coder. Your task is to generate sample business listings in JSON format.

When asked about businesses in a location, return business listings in this exact JSON format, with no additional text or comments:

\`\`\`json
[
  {
    "name": "Example Plumbing Co",
    "address": "123 Main St, Denver, CO 80202",
    "phone": "(303) 555-0123",
    "description": "Licensed plumbing contractor specializing in residential and commercial services",
    "website": "https://exampleplumbing.com",
    "rating": 4.8
  }
]
\`\`\`

Important rules:
1. Return ONLY the JSON array inside code blocks - no explanations or comments
2. Generate realistic but fictional business data
3. Use proper formatting for phone numbers (e.g., "(303) 555-XXXX") and addresses
4. Include ratings from 1-5 stars (can use decimals, e.g., 4.8)
5. When sorting by rating, sort from highest to lowest rating
6. When asked for a specific number of results, always return exactly that many
7. Make each business unique with different names, addresses, and phone numbers
8. Keep descriptions concise and professional
9. Use realistic website URLs based on business names`
        },
        ...messages
      ];

      console.log('Sending request to Ollama with messages:', JSON.stringify(enhancedMessages, null, 2));

      const response = await axios.post(`${this.baseUrl}/api/chat`, {
        model: this.model,
        messages: enhancedMessages,
        stream: false,
        temperature: 0.7,
        max_tokens: 1000,
        system: "You are a business search assistant that always responds with JSON data."
      });

      if (!response.data) {
        throw new Error('Empty response from AI model');
      }

      console.log('\nRaw response data:', JSON.stringify(response.data, null, 2));

      if (!response.data.message?.content) {
        throw new Error('No content in AI model response');
      }

      console.log('\nParsing AI response...');
      const results = await this.sanitizeJsonResponse(response.data.message.content);
      console.log('Parsed results:', JSON.stringify(results, null, 2));

      return results;

    } catch (error) {
      console.error('\nDeepseek chat error:', error);
      if (error instanceof Error) {
        console.error('Error stack:', error.stack);
        throw new Error(`AI model error: ${error.message}`);
      }
      throw new Error('Failed to get response from AI model');
    }
  }

  private async sanitizeJsonResponse(text: string): Promise<PartialBusiness[]> {
    console.log('Attempting to parse response:', text);

    // First try to find JSON blocks
    const jsonBlockMatch = text.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
    if (jsonBlockMatch) {
      try {
        const jsonStr = jsonBlockMatch[1].trim();
        console.log('Found JSON block:', jsonStr);
        const parsed = JSON.parse(jsonStr);
        return Array.isArray(parsed) ? parsed : [parsed];
      } catch (e) {
        console.error('Failed to parse JSON block:', e);
      }
    }

    // Then try to find any JSON-like structure
    const jsonPatterns = [
      /\[\s*\{[\s\S]*\}\s*\]/, // Array of objects
      /\{[\s\S]*\}/ // Single object
    ];

    for (const pattern of jsonPatterns) {
      const match = text.match(pattern);
      if (match) {
        try {
          const jsonStr = match[0].trim();
          console.log('Found JSON pattern:', jsonStr);
          const parsed = JSON.parse(jsonStr);
          return Array.isArray(parsed) ? parsed : [parsed];
        } catch (e) {
          console.error('Failed to parse JSON pattern:', e);
          continue;
        }
      }
    }

    // If no valid JSON found, try to extract structured data
    try {
      const extractedData = this.extractBusinessData(text);
      if (extractedData) {
        console.log('Extracted business data:', extractedData);
        return [extractedData];
      }
    } catch (e) {
      console.error('Failed to extract business data:', e);
    }

    throw new Error('No valid JSON or business information found in response');
  }

  private extractBusinessData(text: string): PartialBusiness {
    // Extract business information using regex patterns
    const businessInfo: PartialBusiness = {
      name: this.extractField(text, 'name', '[^"\\n]+') || 'Unknown Business',
      address: this.extractField(text, 'address', '[^"\\n]+') || 'Address not available',
      phone: this.extractField(text, 'phone', '[^"\\n]+') || 'Phone not available',
      description: this.extractField(text, 'description', '[^"\\n]+') || 'No description available'
    };

    const website = this.extractField(text, 'website', '[^"\\n]+');
    if (website) {
      businessInfo.website = website;
    }

    const rating = this.extractField(text, 'rating', '[0-9.]+');
    if (rating) {
      businessInfo.rating = parseFloat(rating);
    }

    return businessInfo;
  }

  private extractField(text: string, field: string, pattern: string): string {
    const regex = new RegExp(`"?${field}"?\\s*[:=]\\s*"?(${pattern})"?`, 'i');
    const match = text.match(regex);
    return match ? match[1].trim() : '';
  }
}
@@ -1,63 +0,0 @@
import axios from 'axios';
import { sleep } from '../utils/helpers';

interface GeocodingResult {
  lat: number;
  lng: number;
  formattedAddress: string;
}

export class GeocodingService {
  private static cache = new Map<string, GeocodingResult>();
  private static lastRequestTime = 0;
  private static RATE_LIMIT_MS = 1000; // 1 second between requests (Nominatim requirement)

  static async geocode(address: string): Promise<GeocodingResult | null> {
    // Check cache first
    const cached = this.cache.get(address);
    if (cached) return cached;

    try {
      // Rate limiting
      const now = Date.now();
      const timeSinceLastRequest = now - this.lastRequestTime;
      if (timeSinceLastRequest < this.RATE_LIMIT_MS) {
        await sleep(this.RATE_LIMIT_MS - timeSinceLastRequest);
      }
      this.lastRequestTime = Date.now();

      const response = await axios.get(
        'https://nominatim.openstreetmap.org/search',
        {
          params: {
            q: address,
            format: 'json',
            limit: 1,
            addressdetails: 1
          },
          headers: {
            'User-Agent': 'BusinessFinder/1.0'
          }
        }
      );

      if (response.data?.length > 0) {
        const result = response.data[0];
        const geocoded = {
          lat: parseFloat(result.lat),
          lng: parseFloat(result.lon),
          formattedAddress: result.display_name
        };

        // Cache the result
        this.cache.set(address, geocoded);
        return geocoded;
      }

      return null;
    } catch (error) {
      console.error('Geocoding error:', error);
      return null;
    }
  }
}
@@ -1,40 +0,0 @@
import axios from 'axios';
import { supabase } from '../supabase';
import { env } from '../../config/env';

export class HealthCheckService {
  private static async checkSupabase(): Promise<boolean> {
    try {
      const { data, error } = await supabase.from('searches').select('count');
      return !error;
    } catch (error) {
      console.error('Supabase health check failed:', error);
      return false;
    }
  }

  private static async checkSearx(): Promise<boolean> {
    try {
      const response = await axios.get(env.SEARXNG_URL);
      return response.status === 200;
    } catch (error) {
      console.error('SearxNG health check failed:', error);
      return false;
    }
  }

  public static async checkHealth(): Promise<{
    supabase: boolean;
    searx: boolean;
  }> {
    const [supabaseHealth, searxHealth] = await Promise.all([
      this.checkSupabase(),
      this.checkSearx()
    ]);

    return {
      supabase: supabaseHealth,
      searx: searxHealth
    };
  }
}
@@ -1,45 +0,0 @@
import axios from 'axios';
import { env } from '../../config/env';

export class OllamaService {
  private static readonly baseUrl = env.ollama.url;
  private static readonly model = env.ollama.model;

  static async complete(prompt: string): Promise<string> {
    try {
      const response = await axios.post(`${this.baseUrl}/api/generate`, {
        model: this.model,
        prompt: prompt,
        stream: false
      });

      if (response.data?.response) {
        return response.data.response;
      }

      throw new Error('No response from Ollama');
    } catch (error) {
      console.error('Ollama error:', error);
      throw error;
    }
  }

  static async chat(messages: { role: 'user' | 'assistant'; content: string }[]): Promise<string> {
    try {
      const response = await axios.post(`${this.baseUrl}/api/chat`, {
        model: this.model,
        messages: messages,
        stream: false
      });

      if (response.data?.message?.content) {
        return response.data.message.content;
      }

      throw new Error('No response from Ollama chat');
    } catch (error) {
      console.error('Ollama chat error:', error);
      throw error;
    }
  }
}
@@ -1,135 +0,0 @@
import EventEmitter from 'events';
import { DeepSeekService } from './deepseekService';
import { DatabaseService } from './databaseService';
import { Business } from '../types';

interface PartialBusiness {
  name: string;
  address: string;
  phone: string;
  description: string;
  website?: string;
  rating?: number;
  source?: string;
  location?: {
    lat: number;
    lng: number;
  };
}

export class SearchService extends EventEmitter {
  private deepseekService: DeepSeekService;
  private databaseService: DatabaseService;

  constructor() {
    super();
    this.deepseekService = new DeepSeekService();
    this.databaseService = new DatabaseService();

    this.deepseekService.on('progress', (data) => {
      this.emit('progress', data);
    });
  }

  async streamSearch(query: string, location: string, limit: number = 10): Promise<void> {
    try {
      // First, try to find cached results in database
      const cachedResults = await this.databaseService.findBusinessesByQuery(query, location);
      if (cachedResults.length > 0) {
        // Emit cached results one by one
        for (const result of this.sortByRating(cachedResults).slice(0, limit)) {
          this.emit('result', result);
          await new Promise(resolve => setTimeout(resolve, 100)); // Small delay between results
        }
        this.emit('complete');
        return;
      }

      // If no cached results, use DeepSeek to generate new results
      const aiResults = await this.deepseekService.streamChat([{
        role: "user",
        content: `Find ${query} in ${location}. You must return exactly ${limit} results in valid JSON format, sorted by rating from highest to lowest. Each result must include a rating between 1-5 stars. Do not include any comments or explanations in the JSON.`
      }], async (business: PartialBusiness) => {
        try {
          // Extract lat/lng from address using a geocoding service
          const coords = await this.geocodeAddress(business.address);

          // Save to database and emit result
          const savedBusiness = await this.databaseService.saveBusiness({
            ...business,
            source: 'deepseek',
            location: coords || {
              lat: 39.7392, // Denver's default coordinates
              lng: -104.9903
            }
          });

          this.emit('result', savedBusiness);
        } catch (error) {
          console.error('Error processing business:', error);
          this.emit('error', error);
        }
      });

      this.emit('complete');

    } catch (error) {
      console.error('Search error:', error);
      this.emit('error', error);
      throw error;
    }
  }

  async search(query: string, location: string, limit: number = 10): Promise<Business[]> {
    try {
      // First, try to find cached results in database
      const cachedResults = await this.databaseService.findBusinessesByQuery(query, location);
      if (cachedResults.length > 0) {
        return this.sortByRating(cachedResults).slice(0, limit);
      }

      // If no cached results, use DeepSeek to generate new results
      const aiResults = await this.deepseekService.chat([{
        role: "user",
        content: `Find ${query} in ${location}. You must return exactly ${limit} results in valid JSON format, sorted by rating from highest to lowest. Each result must include a rating between 1-5 stars. Do not include any comments or explanations in the JSON.`
      }]);

      // Save the results to database
      const savedResults = await Promise.all(
        (aiResults as PartialBusiness[]).map(async (business: PartialBusiness) => {
          // Extract lat/lng from address using a geocoding service
          const coords = await this.geocodeAddress(business.address);

          return this.databaseService.saveBusiness({
            ...business,
            source: 'deepseek',
            location: coords || {
              lat: 39.7392, // Denver's default coordinates
              lng: -104.9903
            }
          });
        })
      );

      return this.sortByRating(savedResults);

    } catch (error) {
      console.error('Search error:', error);
      throw error;
    }
  }

  private sortByRating(businesses: Business[]): Business[] {
    return businesses.sort((a, b) => b.rating - a.rating);
  }

  private async geocodeAddress(address: string): Promise<{ lat: number; lng: number } | null> {
    // TODO: Implement real geocoding service
    // For now, return null to use default coordinates
    return null;
  }

  async getBusinessById(id: string): Promise<Business | null> {
    return this.databaseService.getBusinessById(id);
  }
}
@@ -1,93 +0,0 @@
import { createClient } from '@supabase/supabase-js';
import { env } from '../../config/env';
import { BusinessData } from '../searxng';

export class SupabaseService {
  private supabase;

  constructor() {
    this.supabase = createClient(env.supabase.url, env.supabase.anonKey);
  }

  async upsertBusinesses(businesses: BusinessData[]): Promise<void> {
    try {
      console.log('Upserting businesses to Supabase:', businesses.length);

      for (const business of businesses) {
        try {
          // Create a unique identifier based on multiple properties
          const identifier = [
            business.name.toLowerCase(),
            business.phone?.replace(/\D/g, ''),
            business.address?.toLowerCase(),
            business.website?.toLowerCase()
          ]
            .filter(Boolean) // Remove empty values
            .join('_') // Join with underscore
            .replace(/[^a-z0-9]/g, '_'); // Replace non-alphanumeric chars

          // Log the data being inserted
          console.log('Upserting business:', {
            id: identifier,
            name: business.name,
            phone: business.phone,
            email: business.email,
            address: business.address,
            rating: business.rating,
            website: business.website,
            location: business.location
          });

          // Check if business exists
          const { data: existing, error: selectError } = await this.supabase
            .from('businesses')
            .select('rating, search_count')
            .eq('id', identifier)
            .single();

          if (selectError && selectError.code !== 'PGRST116') {
            console.error('Error checking existing business:', selectError);
          }

          // Prepare upsert data
          const upsertData = {
            id: identifier,
            name: business.name,
            phone: business.phone || null,
            email: business.email || null,
            address: business.address || null,
            rating: existing ? Math.max(business.rating, existing.rating) : business.rating,
            website: business.website || null,
            logo: business.logo || null,
            source: business.source || null,
            description: business.description || null,
            latitude: business.location?.lat || null,
            longitude: business.location?.lng || null,
            last_updated: new Date().toISOString(),
            search_count: existing ? existing.search_count + 1 : 1
          };

          console.log('Upserting with data:', upsertData);

          const { error: upsertError } = await this.supabase
            .from('businesses')
            .upsert(upsertData, {
              onConflict: 'id'
            });

          if (upsertError) {
            console.error('Error upserting business:', upsertError);
            console.error('Failed business data:', upsertData);
          } else {
            console.log(`Successfully upserted business: ${business.name}`);
          }
        } catch (businessError) {
          console.error('Error processing business:', business.name, businessError);
        }
      }
    } catch (error) {
      console.error('Error saving businesses to Supabase:', error);
      throw error;
    }
  }
}
@@ -1,35 +0,0 @@
import { createClient } from '@supabase/supabase-js';
import { env } from '../config/env';

// Validate Supabase configuration
if (!env.SUPABASE_URL || !env.SUPABASE_KEY) {
  throw new Error('Missing Supabase configuration');
}

// Create Supabase client
export const supabase = createClient(
  env.SUPABASE_URL,
  env.SUPABASE_KEY,
  {
    auth: {
      autoRefreshToken: true,
      persistSession: true,
      detectSessionInUrl: true
    }
  }
);

// Test connection function
export async function testConnection() {
  try {
    console.log('Testing Supabase connection...');
    console.log('URL:', env.SUPABASE_URL);
    const { data, error } = await supabase.from('searches').select('count');
    if (error) throw error;
    console.log('Supabase connection successful');
    return true;
  } catch (error) {
    console.error('Supabase connection failed:', error);
    return false;
  }
}
@@ -1,16 +0,0 @@
export interface Business {
  id: string;
  name: string;
  address: string;
  phone: string;
  description: string;
  website?: string;
  source: string;
  rating: number;
  location: {
    lat: number;
    lng: number;
  };
}

export type BusinessData = Business;
@@ -1,39 +0,0 @@
import crypto from 'crypto';

interface BusinessIdentifier {
  title?: string;
  name?: string;
  phone?: string;
  address?: string;
  url?: string;
  website?: string;
}

export function generateBusinessId(business: BusinessIdentifier): string {
  const components = [
    business.title || business.name,
    business.phone,
    business.address,
    business.url || business.website
  ].filter(Boolean);

  const hash = crypto.createHash('md5')
    .update(components.join('|'))
    .digest('hex');

  return `hash_${hash}`;
}

export function extractPlaceIdFromUrl(url: string): string | null {
  try {
    // Match patterns like:
    // https://www.google.com/maps/place/.../.../data=!3m1!4b1!4m5!3m4!1s0x876c7ed0cb78d6d3:0x2cd0c4490736f7c!8m2!
    // https://maps.google.com/maps?q=...&ftid=0x876c7ed0cb78d6d3:0x2cd0c4490736f7c
    const placeIdRegex = /[!\/]([0-9a-f]{16}:[0-9a-f]{16})/i;
    const match = url.match(placeIdRegex);
    return match ? match[1] : null;
  } catch (error) {
    console.warn('Error extracting place ID from URL:', error);
    return null;
  }
}
@@ -1,36 +0,0 @@
interface CacheItem<T> {
  data: T;
  timestamp: number;
}

export class Cache<T> {
  private store = new Map<string, CacheItem<T>>();
  private ttl: number;

  constructor(ttlMinutes: number = 60) {
    this.ttl = ttlMinutes * 60 * 1000;
  }

  set(key: string, value: T): void {
    this.store.set(key, {
      data: value,
      timestamp: Date.now()
    });
  }

  get(key: string): T | null {
    const item = this.store.get(key);
    if (!item) return null;

    if (Date.now() - item.timestamp > this.ttl) {
      this.store.delete(key);
      return null;
    }

    return item.data;
  }

  clear(): void {
    this.store.clear();
  }
}
@@ -1,67 +0,0 @@
import { Business } from '../types';

export function normalizePhoneNumber(phone: string): string {
  return phone.replace(/[^\d]/g, '');
}

export function normalizeAddress(address: string): string {
  // Remove common suffixes and standardize format
  return address
    .toLowerCase()
    .replace(/(street|st\.?|avenue|ave\.?|road|rd\.?)/g, '')
    .trim();
}

export function extractZipCode(text: string): string | null {
  const match = text.match(/\b\d{5}(?:-\d{4})?\b/);
  return match ? match[0] : null;
}

export function calculateReliabilityScore(business: Business): number {
  let score = 0;

  // More complete data = higher score
  if (business.phone) score += 2;
  if (business.website) score += 1;
  if (business.email) score += 1;
  if (business.hours?.length) score += 2;
  if (business.services && business.services.length > 0) score += 1;
  if (business.reviewCount && business.reviewCount > 10) score += 2;

  return score;
}

export function cleanAddress(address: string): string {
  return address
    .replace(/^(Sure!|Here is |The business address( is| found in the text is)?:?\n?\s*)/i, '')
    .replace(/\n/g, ' ')
    .trim();
}

export function formatPhoneNumber(phone: string): string {
  // Remove all non-numeric characters
  const cleaned = phone.replace(/\D/g, '');

  // Format as (XXX) XXX-XXXX
  if (cleaned.length === 10) {
    return `(${cleaned.slice(0,3)}) ${cleaned.slice(3,6)}-${cleaned.slice(6)}`;
  }

  // Return original if not 10 digits
  return phone;
}

export function cleanEmail(email: string): string {
  // Remove phone numbers from email
  return email
    .replace(/\d{3}-\d{4}/, '')
    .replace(/\d{10}/, '')
    .trim();
}

export function cleanDescription(description: string): string {
  return description
    .replace(/^(Description:|About:|Info:)/i, '')
    .replace(/\s+/g, ' ')
    .trim();
}
@@ -1,18 +0,0 @@
export function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

export function cleanText(text: string): string {
  return text
    .replace(/\s+/g, ' ')
    .replace(/[^\w\s-.,]/g, '')
    .trim();
}

export function isValidPhone(phone: string): boolean {
  return /^\+?[\d-.()\s]{10,}$/.test(phone);
}

export function isValidEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}
@@ -1,23 +0,0 @@
export class RateLimiter {
  private timestamps: number[] = [];
  private readonly windowMs: number;
  private readonly maxRequests: number;

  constructor(windowMs: number = 60000, maxRequests: number = 30) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  async waitForSlot(): Promise<void> {
    const now = Date.now();
    this.timestamps = this.timestamps.filter(time => now - time < this.windowMs);

    if (this.timestamps.length >= this.maxRequests) {
      const oldestRequest = this.timestamps[0];
      const waitTime = this.windowMs - (now - oldestRequest);
      await new Promise(resolve => setTimeout(resolve, waitTime));
    }

    this.timestamps.push(now);
  }
}
@@ -1,168 +0,0 @@
import axios from 'axios';
import * as cheerio from 'cheerio';
import { OllamaService } from '../services/ollamaService';
import { sleep } from './helpers';

const RATE_LIMIT_MS = 1000; // 1 second between requests
let lastRequestTime = 0;

async function rateLimitedRequest(url: string) {
  const now = Date.now();
  const timeSinceLastRequest = now - lastRequestTime;

  if (timeSinceLastRequest < RATE_LIMIT_MS) {
    await sleep(RATE_LIMIT_MS - timeSinceLastRequest);
  }

  lastRequestTime = Date.now();
  return axios.get(url, {
    timeout: 5000,
    headers: {
      'User-Agent': 'Mozilla/5.0 (compatible; BusinessFinder/1.0; +http://example.com/bot)',
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
      'Accept-Language': 'en-US,en;q=0.5'
    }
  });
}

export interface ContactInfo {
  phone?: string;
  email?: string;
  address?: string;
  description?: string;
  openingHours?: string[];
}

export async function extractContactFromHtml(url: string): Promise<ContactInfo> {
  try {
    const response = await rateLimitedRequest(url);

    const $ = cheerio.load(response.data);

    // Extract structured data if available
    const structuredData = $('script[type="application/ld+json"]')
      .map((_, el) => {
        try {
          return JSON.parse($(el).html() || '');
        } catch {
          return null;
        }
      })
      .get()
      .filter(Boolean);

    // Look for LocalBusiness or Restaurant schema
    const businessData = structuredData.find(data =>
      data['@type'] === 'LocalBusiness' ||
      data['@type'] === 'Restaurant'
    );

    if (businessData) {
      return {
        phone: businessData.telephone,
        email: businessData.email,
        address: businessData.address?.streetAddress,
        description: businessData.description,
        openingHours: businessData.openingHours
      };
    }

    // Fallback to regular HTML parsing
    return {
      phone: findPhone($),
      email: findEmail($),
      address: findAddress($),
      description: $('meta[name="description"]').attr('content'),
      openingHours: findOpeningHours($)
    };
  } catch (error) {
    console.warn(`Error extracting contact info from ${url}:`, error);
    return {};
  }
}

export async function extractCleanAddress(text: string, location: string): Promise<string> {
  try {
    // OllamaService.complete is static, so no instance is needed
    const prompt = `
      Extract a business address from this text. The business should be in or near ${location}.
      Only return the address, nothing else. If no valid address is found, return an empty string.

      Text: ${text}
    `;

    const response = await OllamaService.complete(prompt);
    return response.trim();
  } catch (error) {
    console.warn('Error extracting address:', error);
    return '';
  }
}

// Helper functions
function findPhone($: cheerio.CheerioAPI): string | undefined {
  // Common phone patterns
  const phonePatterns = [
    /\b\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})\b/,
    /\b(?:Phone|Tel|Contact):\s*([0-9-().+ ]{10,})\b/i
  ];

  for (const pattern of phonePatterns) {
    const match = $.text().match(pattern);
    if (match) return match[0];
  }

  return undefined;
}

function findEmail($: cheerio.CheerioAPI): string | undefined {
  const emailPattern = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/;
  const match = $.text().match(emailPattern);
  return match ? match[0] : undefined;
}

function findAddress($: cheerio.CheerioAPI): string | undefined {
  // Look for address in common elements
  const addressSelectors = [
    'address',
    '[itemtype="http://schema.org/PostalAddress"]',
    '.address',
    '#address',
    '[class*="address"]',
    '[id*="address"]'
  ];

  for (const selector of addressSelectors) {
    const element = $(selector).first();
    if (element.length) {
      return element.text().trim();
    }
  }

  return undefined;
}

function findOpeningHours($: cheerio.CheerioAPI): string[] {
  const hours: string[] = [];
  const hoursSelectors = [
    '[itemtype="http://schema.org/OpeningHoursSpecification"]',
    '.hours',
    '#hours',
    '[class*="hours"]',
    '[id*="hours"]'
  ];

  for (const selector of hoursSelectors) {
    const element = $(selector).first();
    if (element.length) {
      element.find('*').each((_, el) => {
        const text = $(el).text().trim();
        if (text && !hours.includes(text)) {
          hours.push(text);
        }
      });
    }
  }

  return hours;
}
@@ -1,119 +0,0 @@
|
|||||||
import * as cheerio from 'cheerio';
|
|
||||||
|
|
||||||
interface StructuredData {
|
|
||||||
name?: string;
|
|
||||||
email?: string;
|
|
||||||
phone?: string;
|
|
||||||
address?: string;
|
|
||||||
socialProfiles?: string[];
|
|
||||||
openingHours?: Record<string, string>;
|
|
||||||
description?: string;
|
|
||||||
}
|
|
||||||
|
|
||||||
export class StructuredDataParser {
|
|
||||||
static parse($: cheerio.CheerioAPI): StructuredData[] {
|
|
||||||
const results: StructuredData[] = [];
|
|
||||||
|
|
||||||
// Parse JSON-LD
|
|
||||||
    $('script[type="application/ld+json"]').each((_, element) => {
      try {
        const data = JSON.parse($(element).html() || '{}');
        if (Array.isArray(data)) {
          data.forEach(item => this.parseStructuredItem(item, results));
        } else {
          this.parseStructuredItem(data, results);
        }
      } catch (e) {
        console.error('Error parsing JSON-LD:', e);
      }
    });

    // Parse microdata
    $('[itemtype]').each((_, element) => {
      const type = $(element).attr('itemtype');
      if (type?.includes('Organization') || type?.includes('LocalBusiness')) {
        const data: StructuredData = {
          name: $('[itemprop="name"]', element).text(),
          email: $('[itemprop="email"]', element).text(),
          phone: $('[itemprop="telephone"]', element).text(),
          address: this.extractMicrodataAddress($, element),
          socialProfiles: this.extractSocialProfiles($, element)
        };
        results.push(data);
      }
    });

    // Parse RDFa
    $('[typeof="Organization"], [typeof="LocalBusiness"]').each((_, element) => {
      const data: StructuredData = {
        name: $('[property="name"]', element).text(),
        email: $('[property="email"]', element).text(),
        phone: $('[property="telephone"]', element).text(),
        address: this.extractRdfaAddress($, element),
        socialProfiles: this.extractSocialProfiles($, element)
      };
      results.push(data);
    });

    return results;
  }

  private static parseStructuredItem(data: any, results: StructuredData[]): void {
    if (data['@type'] === 'Organization' || data['@type'] === 'LocalBusiness') {
      results.push({
        name: data.name,
        email: data.email,
        phone: data.telephone,
        address: this.formatAddress(data.address),
        socialProfiles: this.extractSocialUrls(data),
        openingHours: this.parseOpeningHours(data.openingHours),
        description: data.description
      });
    }
  }

  private static formatAddress(address: any): string | undefined {
    if (typeof address === 'string') return address;
    if (typeof address === 'object') {
      const parts = [
        address.streetAddress,
        address.addressLocality,
        address.addressRegion,
        address.postalCode,
        address.addressCountry
      ].filter(Boolean);
      return parts.join(', ');
    }
    return undefined;
  }

  private static extractSocialUrls(data: any): string[] {
    const urls: string[] = [];
    if (data.sameAs) {
      if (Array.isArray(data.sameAs)) {
        urls.push(...data.sameAs);
      } else if (typeof data.sameAs === 'string') {
        urls.push(data.sameAs);
      }
    }
    return urls;
  }

  private static parseOpeningHours(hours: any): Record<string, string> | undefined {
    if (!hours) return undefined;

    if (Array.isArray(hours)) {
      const schedule: Record<string, string> = {};
      hours.forEach(spec => {
        const match = spec.match(/^(\w+)(-\w+)?\s+(\d\d:\d\d)-(\d\d:\d\d)$/);
        if (match) {
          schedule[match[1]] = `${match[3]}-${match[4]}`;
        }
      });
      return schedule;
    }
    return undefined;
  }

  // ... helper methods for microdata and RDFa parsing ...
}
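The `parseOpeningHours` helper above turns schema.org-style strings such as `"Mo-Fr 09:00-17:00"` into a day-keyed schedule with a single regular expression. The same logic can be exercised in isolation; this is a standalone sketch, not tied to the class:

```typescript
// Standalone sketch of the opening-hours parsing used above.
// Note: the optional "-Fr" range suffix is matched but ignored, so a
// "Mo-Fr" entry is keyed only by its first day token.
function parseOpeningHours(hours: string[]): Record<string, string> {
  const schedule: Record<string, string> = {};
  for (const spec of hours) {
    const match = spec.match(/^(\w+)(-\w+)?\s+(\d\d:\d\d)-(\d\d:\d\d)$/);
    if (match) {
      schedule[match[1]] = `${match[3]}-${match[4]}`;
    }
  }
  return schedule;
}

console.log(parseOpeningHours(['Mo-Fr 09:00-17:00', 'Sa 10:00-14:00']));
// → { Mo: '09:00-17:00', Sa: '10:00-14:00' }
```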
@@ -1,47 +0,0 @@
import { Request, Response, NextFunction } from 'express';
import { supabase } from '../lib/supabase';

// Extend Express Request type to include user
declare global {
  namespace Express {
    interface Request {
      user?: {
        id: string;
        email: string;
        role: string;
      };
    }
  }
}

export async function authenticateUser(
  req: Request,
  res: Response,
  next: NextFunction
) {
  try {
    const authHeader = req.headers.authorization;
    if (!authHeader) {
      return res.status(401).json({ error: 'No authorization header' });
    }

    const token = authHeader.replace('Bearer ', '');
    const { data: { user }, error } = await supabase.auth.getUser(token);

    if (error || !user) {
      return res.status(401).json({ error: 'Invalid token' });
    }

    // Add user info to request
    req.user = {
      id: user.id,
      email: user.email!,
      role: (user.app_metadata?.role as string) || 'user'
    };

    next();
  } catch (error) {
    console.error('Authentication error:', error);
    res.status(401).json({ error: 'Authentication failed' });
  }
}
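The middleware above maps the Supabase user onto `req.user`, defaulting the role to `'user'` when `app_metadata` carries none. That mapping step can be sketched on its own; `toRequestUser` is a hypothetical helper for illustration, not part of the middleware:

```typescript
// Sketch of the user-mapping step from the middleware above.
// `toRequestUser` is a hypothetical name introduced for this example.
interface SupabaseUser {
  id: string;
  email: string;
  app_metadata?: { role?: string };
}

function toRequestUser(user: SupabaseUser) {
  return {
    id: user.id,
    email: user.email,
    // Fall back to the least-privileged role when none is set.
    role: user.app_metadata?.role ?? 'user',
  };
}

console.log(toRequestUser({ id: 'u1', email: 'a@b.co' }).role); // → "user"
```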
@@ -1,148 +0,0 @@
import express from 'express';
import { SearchService } from '../lib/services/searchService';
import { Business } from '../lib/types';

const router = express.Router();
const searchService = new SearchService();

// Error handling middleware for JSON parsing errors
router.use((err: Error, req: express.Request, res: express.Response, next: express.NextFunction) => {
  if (err instanceof SyntaxError && 'body' in err) {
    return res.status(400).json({
      success: false,
      error: 'Invalid JSON'
    });
  }
  next();
});

// Business categories endpoint
router.get('/categories', (req, res) => {
  const categories = [
    'Restaurant',
    'Retail',
    'Service',
    'Healthcare',
    'Professional',
    'Entertainment',
    'Education',
    'Technology',
    'Manufacturing',
    'Construction',
    'Transportation',
    'Real Estate',
    'Financial',
    'Legal',
    'Other'
  ];
  res.json(categories);
});

// Streaming search endpoint
router.post('/search/stream', (req, res) => {
  const { query, location } = req.body;

  if (!query || !location) {
    return res.status(400).json({
      success: false,
      error: 'Query and location are required'
    });
  }

  // Set headers for SSE
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  // Send initial message
  res.write('data: {"type":"start","message":"Starting search..."}\n\n');

  // Create search service instance for this request
  const search = new SearchService();

  // Listen for individual results
  search.on('result', (business: Business) => {
    res.write(`data: {"type":"result","business":${JSON.stringify(business)}}\n\n`);
  });

  // Listen for progress updates
  search.on('progress', (data: any) => {
    res.write(`data: {"type":"progress","data":${JSON.stringify(data)}}\n\n`);
  });

  // Listen for completion
  search.on('complete', () => {
    res.write('data: {"type":"complete","message":"Search complete"}\n\n');
    res.end();
  });

  // Listen for errors
  search.on('error', (error: Error) => {
    res.write(`data: {"type":"error","message":${JSON.stringify(error.message)}}\n\n`);
    res.end();
  });

  // Start the search
  search.streamSearch(query, location).catch(error => {
    console.error('Search error:', error);
    res.write(`data: {"type":"error","message":${JSON.stringify(error.message)}}\n\n`);
    res.end();
  });
});

// Regular search endpoint (non-streaming)
router.post('/search', async (req, res) => {
  const { query, location } = req.body;

  if (!query || !location) {
    return res.status(400).json({
      success: false,
      error: 'Query and location are required'
    });
  }

  try {
    const results = await searchService.search(query, location);
    res.json({
      success: true,
      results
    });
  } catch (error) {
    const errorMessage = error instanceof Error ? error.message : 'An error occurred during search';
    console.error('Search error:', error);
    res.status(500).json({
      success: false,
      error: errorMessage
    });
  }
});

// Get business by ID
router.get('/business/:id', async (req, res) => {
  const { id } = req.params;

  try {
    const business = await searchService.getBusinessById(id);

    if (!business) {
      return res.status(404).json({
        success: false,
        error: 'Business not found'
      });
    }

    res.json({
      success: true,
      business
    });
  } catch (error) {
    const errorMessage = error instanceof Error ? error.message : 'Failed to fetch business details';
    console.error('Error fetching business:', error);
    res.status(500).json({
      success: false,
      error: errorMessage
    });
  }
});

export default router;
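The streaming endpoint above emits server-sent events as `data: {...}` frames separated by blank lines. A minimal client-side parser for that wire format, assuming each frame carries exactly one JSON payload, could look like this sketch:

```typescript
// Minimal parser for the SSE frames emitted by the /search/stream route above.
// Each event is a "data: <json>" line terminated by a blank line.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

function parseSseChunk(chunk: string): StreamEvent[] {
  return chunk
    .split('\n\n')
    .filter((frame) => frame.startsWith('data: '))
    .map((frame) => JSON.parse(frame.slice('data: '.length)) as StreamEvent);
}

const events = parseSseChunk(
  'data: {"type":"start","message":"Starting search..."}\n\n' +
    'data: {"type":"complete","message":"Search complete"}\n\n',
);
console.log(events.map((e) => e.type)); // → [ 'start', 'complete' ]
```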
@@ -1,413 +0,0 @@
import { Router } from 'express';
import { z } from 'zod';
import { supabase } from '../lib/supabase';
import { authenticateUser } from '../middleware/auth';

const router = Router();

// Initialize database tables
async function initializeTables() {
  try {
    // Create businesses table if it doesn't exist
    const { error: businessError } = await supabase.from('businesses').select('id').limit(1);

    if (businessError?.code === 'PGRST204') {
      const { error } = await supabase.rpc('execute_sql', {
        sql_string: `
          CREATE TABLE IF NOT EXISTS public.businesses (
            id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            phone TEXT,
            email TEXT,
            address TEXT,
            rating NUMERIC,
            website TEXT,
            description TEXT,
            source TEXT,
            logo TEXT,
            latitude NUMERIC,
            longitude NUMERIC,
            last_updated TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc'::text, now()),
            search_count INTEGER DEFAULT 1,
            created_at TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc'::text, now()),
            place_id TEXT
          );
        `
      });
      if (error) console.error('Error creating businesses table:', error);
    }

    // Create business_profiles table if it doesn't exist
    const { error: profileError } = await supabase.from('business_profiles').select('business_id').limit(1);

    if (profileError?.code === 'PGRST204') {
      const { error } = await supabase.rpc('execute_sql', {
        sql_string: `
          CREATE TABLE IF NOT EXISTS public.business_profiles (
            business_id TEXT PRIMARY KEY REFERENCES public.businesses(id),
            claimed_by UUID REFERENCES auth.users(id),
            claimed_at TIMESTAMP WITH TIME ZONE,
            verification_status TEXT NOT NULL DEFAULT 'unverified',
            social_links JSONB DEFAULT '{}',
            hours_of_operation JSONB DEFAULT '{}',
            additional_photos TEXT[] DEFAULT '{}',
            tags TEXT[] DEFAULT '{}',
            updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
            CONSTRAINT valid_verification_status CHECK (verification_status IN ('unverified', 'pending', 'verified', 'rejected'))
          );

          CREATE TABLE IF NOT EXISTS public.business_claims (
            id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
            business_id TEXT NOT NULL REFERENCES public.businesses(id),
            user_id UUID NOT NULL REFERENCES auth.users(id),
            status TEXT NOT NULL DEFAULT 'pending',
            proof_documents TEXT[] DEFAULT '{}',
            submitted_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
            reviewed_at TIMESTAMP WITH TIME ZONE,
            reviewed_by UUID REFERENCES auth.users(id),
            notes TEXT,
            CONSTRAINT valid_claim_status CHECK (status IN ('pending', 'approved', 'rejected'))
          );

          CREATE INDEX IF NOT EXISTS idx_business_profiles_claimed_by ON public.business_profiles(claimed_by);
          CREATE INDEX IF NOT EXISTS idx_business_claims_business_id ON public.business_claims(business_id);
          CREATE INDEX IF NOT EXISTS idx_business_claims_user_id ON public.business_claims(user_id);
          CREATE INDEX IF NOT EXISTS idx_business_claims_status ON public.business_claims(status);

          ALTER TABLE public.business_profiles ENABLE ROW LEVEL SECURITY;
          ALTER TABLE public.business_claims ENABLE ROW LEVEL SECURITY;

          DROP POLICY IF EXISTS "Public profiles are viewable by everyone" ON public.business_profiles;
          CREATE POLICY "Public profiles are viewable by everyone"
            ON public.business_profiles FOR SELECT
            USING (true);

          DROP POLICY IF EXISTS "Profiles can be updated by verified owners" ON public.business_profiles;
          CREATE POLICY "Profiles can be updated by verified owners"
            ON public.business_profiles FOR UPDATE
            USING (auth.uid() = claimed_by AND verification_status = 'verified');

          DROP POLICY IF EXISTS "Users can view their own claims" ON public.business_claims;
          CREATE POLICY "Users can view their own claims"
            ON public.business_claims FOR SELECT
            USING (auth.uid() = user_id);

          DROP POLICY IF EXISTS "Users can create claims" ON public.business_claims;
          CREATE POLICY "Users can create claims"
            ON public.business_claims FOR INSERT
            WITH CHECK (auth.uid() = user_id);

          DROP POLICY IF EXISTS "Only admins can review claims" ON public.business_claims;
          CREATE POLICY "Only admins can review claims"
            ON public.business_claims FOR UPDATE
            USING (EXISTS (
              SELECT 1 FROM auth.users
              WHERE auth.uid() = id
              AND raw_app_meta_data->>'role' = 'admin'
            ));
        `
      });
      if (error) console.error('Error creating profile tables:', error);
    }

    // Insert test data
    const { error: testDataError } = await supabase
      .from('businesses')
      .insert([
        {
          id: 'test-business-1',
          name: 'Test Coffee Shop',
          phone: '303-555-0123',
          email: 'contact@testcoffee.com',
          address: '123 Test St, Denver, CO 80202',
          rating: 4.5,
          website: 'https://testcoffee.com',
          description: 'A cozy coffee shop in downtown Denver serving artisanal coffee and pastries.',
          source: 'manual'
        }
      ])
      .select()
      .single();

    if (testDataError) {
      console.error('Error inserting test data:', testDataError);
    }

    // Create test business profile
    const { error: testProfileError } = await supabase
      .from('business_profiles')
      .insert([
        {
          business_id: 'test-business-1',
          verification_status: 'unverified',
          social_links: {
            facebook: 'https://facebook.com/testcoffee',
            instagram: 'https://instagram.com/testcoffee'
          },
          hours_of_operation: {
            monday: ['7:00', '19:00'],
            tuesday: ['7:00', '19:00'],
            wednesday: ['7:00', '19:00'],
            thursday: ['7:00', '19:00'],
            friday: ['7:00', '20:00'],
            saturday: ['8:00', '20:00'],
            sunday: ['8:00', '18:00']
          },
          tags: ['coffee', 'pastries', 'breakfast', 'lunch']
        }
      ])
      .select()
      .single();

    if (testProfileError) {
      console.error('Error creating test profile:', testProfileError);
    }
  } catch (error) {
    console.error('Error initializing tables:', error);
  }
}

// Call initialization on startup
initializeTables();

// Schema for business profile updates
const profileUpdateSchema = z.object({
  social_links: z.record(z.string()).optional(),
  hours_of_operation: z.record(z.array(z.string())).optional(),
  additional_photos: z.array(z.string()).optional(),
  tags: z.array(z.string()).optional(),
});

// Schema for claim submissions
const claimSubmissionSchema = z.object({
  business_id: z.string(),
  proof_documents: z.array(z.string()),
  notes: z.string().optional(),
});

// Get business profile
router.get('/:businessId', async (req, res) => {
  try {
    const { businessId } = req.params;

    // Get business details and profile
    const { data: business, error: businessError } = await supabase
      .from('businesses')
      .select(`
        *,
        business_profiles (*)
      `)
      .eq('id', businessId)
      .single();

    if (businessError) throw businessError;
    if (!business) {
      return res.status(404).json({ error: 'Business not found' });
    }

    res.json(business);
  } catch (error) {
    console.error('Error fetching business profile:', error);
    res.status(500).json({ error: 'Failed to fetch business profile' });
  }
});

// Update business profile (requires authentication)
router.patch('/:businessId/profile', authenticateUser, async (req, res) => {
  try {
    const { businessId } = req.params;
    if (!req.user) {
      return res.status(401).json({ error: 'User not authenticated' });
    }
    const userId = req.user.id;
    const updates = profileUpdateSchema.parse(req.body);

    // Check if user owns this profile
    const { data: profile } = await supabase
      .from('business_profiles')
      .select('claimed_by, verification_status')
      .eq('business_id', businessId)
      .single();

    if (!profile || profile.claimed_by !== userId || profile.verification_status !== 'verified') {
      return res.status(403).json({ error: 'Not authorized to update this profile' });
    }

    // Update profile
    const { error: updateError } = await supabase
      .from('business_profiles')
      .update({
        ...updates,
        updated_at: new Date().toISOString(),
      })
      .eq('business_id', businessId);

    if (updateError) throw updateError;

    res.json({ message: 'Profile updated successfully' });
  } catch (error) {
    console.error('Error updating business profile:', error);
    res.status(500).json({ error: 'Failed to update profile' });
  }
});

// Submit a claim for a business
router.post('/claim', authenticateUser, async (req, res) => {
  try {
    if (!req.user) {
      return res.status(401).json({ error: 'User not authenticated' });
    }
    const userId = req.user.id;
    const claim = claimSubmissionSchema.parse(req.body);

    // Check if business exists
    const { data: business } = await supabase
      .from('businesses')
      .select('id')
      .eq('id', claim.business_id)
      .single();

    if (!business) {
      return res.status(404).json({ error: 'Business not found' });
    }

    // Check if business is already claimed
    const { data: existingProfile } = await supabase
      .from('business_profiles')
      .select('claimed_by')
      .eq('business_id', claim.business_id)
      .single();

    if (existingProfile?.claimed_by) {
      return res.status(400).json({ error: 'Business is already claimed' });
    }

    // Check for existing pending claims
    const { data: existingClaim } = await supabase
      .from('business_claims')
      .select('id')
      .eq('business_id', claim.business_id)
      .eq('status', 'pending')
      .single();

    if (existingClaim) {
      return res.status(400).json({ error: 'A pending claim already exists for this business' });
    }

    // Create claim
    const { error: claimError } = await supabase
      .from('business_claims')
      .insert({
        business_id: claim.business_id,
        user_id: userId,
        proof_documents: claim.proof_documents,
        notes: claim.notes,
      });

    if (claimError) throw claimError;

    res.json({ message: 'Claim submitted successfully' });
  } catch (error) {
    console.error('Error submitting business claim:', error);
    res.status(500).json({ error: 'Failed to submit claim' });
  }
});

// Get claims for a business (admin only)
router.get('/:businessId/claims', authenticateUser, async (req, res) => {
  try {
    const { businessId } = req.params;
    if (!req.user) {
      return res.status(401).json({ error: 'User not authenticated' });
    }
    const userId = req.user.id;

    // Check if user is admin
    const { data: user } = await supabase
      .from('users')
      .select('raw_app_meta_data')
      .eq('id', userId)
      .single();

    if (user?.raw_app_meta_data?.role !== 'admin') {
      return res.status(403).json({ error: 'Not authorized' });
    }

    const { data: claims, error } = await supabase
      .from('business_claims')
      .select(`
        *,
        user:user_id (
          email
        )
      `)
      .eq('business_id', businessId)
      .order('submitted_at', { ascending: false });

    if (error) throw error;

    res.json(claims);
  } catch (error) {
    console.error('Error fetching business claims:', error);
    res.status(500).json({ error: 'Failed to fetch claims' });
  }
});

// Review a claim (admin only)
router.post('/claims/:claimId/review', authenticateUser, async (req, res) => {
  try {
    const { claimId } = req.params;
    if (!req.user) {
      return res.status(401).json({ error: 'User not authenticated' });
    }
    const userId = req.user.id;
    const { status, notes } = z.object({
      status: z.enum(['approved', 'rejected']),
      notes: z.string().optional(),
    }).parse(req.body);

    // Check if user is admin
    const { data: user } = await supabase
      .from('users')
      .select('raw_app_meta_data')
      .eq('id', userId)
      .single();

    if (user?.raw_app_meta_data?.role !== 'admin') {
      return res.status(403).json({ error: 'Not authorized' });
    }

    // Get claim details
    const { data: claim } = await supabase
      .from('business_claims')
      .select('business_id, status')
      .eq('id', claimId)
      .single();

    if (!claim) {
      return res.status(404).json({ error: 'Claim not found' });
    }

    if (claim.status !== 'pending') {
      return res.status(400).json({ error: 'Claim has already been reviewed' });
    }

    // Start a transaction
    const { error: updateError } = await supabase.rpc('review_business_claim', {
      p_claim_id: claimId,
      p_business_id: claim.business_id,
      p_user_id: userId,
      p_status: status,
      p_notes: notes
    });

    if (updateError) throw updateError;

    res.json({ message: 'Claim reviewed successfully' });
  } catch (error) {
    console.error('Error reviewing business claim:', error);
    res.status(500).json({ error: 'Failed to review claim' });
  }
});

export default router;
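The claim tables above constrain `status` to `pending`, `approved`, or `rejected`, and the review route rejects claims that are no longer pending, so reviews are one-shot. That guard can be sketched as a standalone check (illustrative only, not the route's actual implementation):

```typescript
// Sketch of the claim-review guard enforced by the review route above.
type ClaimStatus = 'pending' | 'approved' | 'rejected';

function canReviewClaim(current: ClaimStatus): boolean {
  // Only pending claims may be approved or rejected; reviews are final.
  return current === 'pending';
}

console.log(canReviewClaim('pending'));  // → true
console.log(canReviewClaim('approved')); // → false
```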
@@ -10,6 +10,19 @@ import {
   getGeminiApiKey,
   getOpenaiApiKey,
   updateConfig,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiApiKey,
+  getCustomOpenaiModelName,
+  getSearchEngineBackend,
+  getImageSearchEngineBackend,
+  getVideoSearchEngineBackend,
+  getNewsSearchEngineBackend,
+  getSearxngApiEndpoint,
+  getGoogleApiKey,
+  getGoogleCseId,
+  getBingSubscriptionKey,
+  getBraveApiKey,
+  getYacyJsonEndpoint,
 } from '../config';
 import logger from '../utils/logger';
@@ -54,6 +67,24 @@ router.get('/', async (_, res) => {
     config['anthropicApiKey'] = getAnthropicApiKey();
     config['groqApiKey'] = getGroqApiKey();
     config['geminiApiKey'] = getGeminiApiKey();
+    config['customOpenaiApiUrl'] = getCustomOpenaiApiUrl();
+    config['customOpenaiApiKey'] = getCustomOpenaiApiKey();
+    config['customOpenaiModelName'] = getCustomOpenaiModelName();
+
+    // Add search engine configuration
+    config['searchEngineBackends'] = {
+      search: getSearchEngineBackend(),
+      image: getImageSearchEngineBackend(),
+      video: getVideoSearchEngineBackend(),
+      news: getNewsSearchEngineBackend(),
+    };
+
+    config['searxngEndpoint'] = getSearxngApiEndpoint();
+    config['googleApiKey'] = getGoogleApiKey();
+    config['googleCseId'] = getGoogleCseId();
+    config['bingSubscriptionKey'] = getBingSubscriptionKey();
+    config['braveApiKey'] = getBraveApiKey();
+    config['yacyEndpoint'] = getYacyJsonEndpoint();
+
     res.status(200).json(config);
   } catch (err: any) {
@@ -66,14 +97,51 @@ router.post('/', async (req, res) => {
   const config = req.body;

   const updatedConfig = {
-    API_KEYS: {
-      OPENAI: config.openaiApiKey,
-      GROQ: config.groqApiKey,
-      ANTHROPIC: config.anthropicApiKey,
-      GEMINI: config.geminiApiKey,
+    MODELS: {
+      OPENAI: {
+        API_KEY: config.openaiApiKey,
+      },
+      GROQ: {
+        API_KEY: config.groqApiKey,
+      },
+      ANTHROPIC: {
+        API_KEY: config.anthropicApiKey,
+      },
+      GEMINI: {
+        API_KEY: config.geminiApiKey,
+      },
+      OLLAMA: {
+        API_URL: config.ollamaApiUrl,
+      },
+      CUSTOM_OPENAI: {
+        API_URL: config.customOpenaiApiUrl,
+        API_KEY: config.customOpenaiApiKey,
+        MODEL_NAME: config.customOpenaiModelName,
+      },
     },
-    API_ENDPOINTS: {
-      OLLAMA: config.ollamaApiUrl,
+    SEARCH_ENGINE_BACKENDS: config.searchEngineBackends ? {
+      SEARCH: config.searchEngineBackends.search,
+      IMAGE: config.searchEngineBackends.image,
+      VIDEO: config.searchEngineBackends.video,
+      NEWS: config.searchEngineBackends.news,
+    } : undefined,
+    SEARCH_ENGINES: {
+      GOOGLE: {
+        API_KEY: config.googleApiKey,
+        CSE_ID: config.googleCseId,
+      },
+      SEARXNG: {
+        ENDPOINT: config.searxngEndpoint,
+      },
+      BING: {
+        SUBSCRIPTION_KEY: config.bingSubscriptionKey,
+      },
+      BRAVE: {
+        API_KEY: config.braveApiKey,
+      },
+      YACY: {
+        ENDPOINT: config.yacyEndpoint,
+      },
     },
   };
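The hunk above moves flat `API_KEYS`/`API_ENDPOINTS` entries into nested per-provider `MODELS` sections. A minimal sketch of that reshaping, using a hypothetical `toNestedModels` helper and only a subset of the fields from the diff:

```typescript
// Sketch of the flat-to-nested config reshaping shown in the diff above.
// `toNestedModels` is a hypothetical helper; field names come from the diff.
interface FlatConfig {
  openaiApiKey?: string;
  groqApiKey?: string;
  ollamaApiUrl?: string;
}

function toNestedModels(config: FlatConfig) {
  return {
    MODELS: {
      OPENAI: { API_KEY: config.openaiApiKey },
      GROQ: { API_KEY: config.groqApiKey },
      OLLAMA: { API_URL: config.ollamaApiUrl },
    },
  };
}

const nested = toNestedModels({ openaiApiKey: 'sk-test' });
console.log(nested.MODELS.OPENAI.API_KEY); // → "sk-test"
```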
@@ -1,42 +1,125 @@
 import express from 'express';
-import { searchSearxng } from '../lib/searxng';
+import { searchSearxng } from '../lib/searchEngines/searxng';
+import { searchGooglePSE } from '../lib/searchEngines/google_pse';
+import { searchBraveAPI } from '../lib/searchEngines/brave';
+import { searchYaCy } from '../lib/searchEngines/yacy';
+import { searchBingAPI } from '../lib/searchEngines/bing';
+import { getNewsSearchEngineBackend } from '../config';
 import logger from '../utils/logger';

 const router = express.Router();

+async function performSearch(query: string, site: string) {
+  const searchEngine = getNewsSearchEngineBackend();
+  switch (searchEngine) {
+    case 'google': {
+      const googleResult = await searchGooglePSE(query);
+
+      return googleResult.originalres.map((item) => {
+        const imageSources = [
+          item.pagemap?.cse_image?.[0]?.src,
+          item.pagemap?.cse_thumbnail?.[0]?.src,
+          item.pagemap?.metatags?.[0]?.['og:image'],
+          item.pagemap?.metatags?.[0]?.['twitter:image'],
+          item.pagemap?.metatags?.[0]?.['image'],
+        ].filter(Boolean); // Remove undefined values
+
+        return {
+          title: item.title,
+          url: item.link,
+          content: item.snippet,
+          thumbnail: imageSources[0], // First available image
+          img_src: imageSources[0], // Same as thumbnail for consistency
+          iframe_src: null,
+          author: item.pagemap?.metatags?.[0]?.['og:site_name'] || site,
+          publishedDate:
+            item.pagemap?.metatags?.[0]?.['article:published_time'],
+        };
+      });
+    }
+
+    case 'searxng': {
+      const searxResult = await searchSearxng(query, {
+        engines: ['bing news'],
+        pageno: 1,
+      });
+      return searxResult.results;
+    }
+
+    case 'brave': {
+      const braveResult = await searchBraveAPI(query);
+      return braveResult.results.map((item) => ({
+        title: item.title,
+        url: item.url,
+        content: item.content,
+        thumbnail: item.img_src,
+        img_src: item.img_src,
+        iframe_src: null,
+        author: item.meta?.fetched || site,
+        publishedDate: item.meta?.lastCrawled,
+      }));
+    }
+
+    case 'yacy': {
+      const yacyResult = await searchYaCy(query);
+      return yacyResult.results.map((item) => ({
+        title: item.title,
+        url: item.url,
+        content: item.content,
+        thumbnail: item.img_src,
+        img_src: item.img_src,
+        iframe_src: null,
+        author: item?.host || site,
+        publishedDate: item?.pubDate,
+      }));
+    }
+
+    case 'bing': {
+      const bingResult = await searchBingAPI(query);
+      return bingResult.results.map((item) => ({
+        title: item.title,
+        url: item.url,
+        content: item.content,
+        thumbnail: item.img_src,
+        img_src: item.img_src,
+        iframe_src: null,
+        author: item?.publisher || site,
+        publishedDate: item?.datePublished,
+      }));
+    }
+
+    default:
+      throw new Error(`Unknown search engine ${searchEngine}`);
+  }
+}
+
 router.get('/', async (req, res) => {
   try {
+    const queries = [
+      { site: 'businessinsider.com', topic: 'AI' },
+      { site: 'www.exchangewire.com', topic: 'AI' },
+      { site: 'yahoo.com', topic: 'AI' },
+      { site: 'businessinsider.com', topic: 'tech' },
+      { site: 'www.exchangewire.com', topic: 'tech' },
+      { site: 'yahoo.com', topic: 'tech' },
+    ];
+
     const data = (
-      await Promise.all([
-        searchSearxng('site:businessinsider.com AI', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-        searchSearxng('site:www.exchangewire.com AI', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-        searchSearxng('site:yahoo.com AI', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-        searchSearxng('site:businessinsider.com tech', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-        searchSearxng('site:www.exchangewire.com tech', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-        searchSearxng('site:yahoo.com tech', {
-          engines: ['bing news'],
-          pageno: 1,
-        }),
-      ])
+      await Promise.all(
+        queries.map(async ({ site, topic }) => {
+          try {
+            const query = `site:${site} ${topic}`;
+            return await performSearch(query, site);
+          } catch (error) {
+            logger.error(`Error searching ${site}: ${error.message}`);
+            return [];
+          }
+        }),
+      )
     )
-      .map((result) => result.results)
       .flat()
-      .sort(() => Math.random() - 0.5);
+      .sort(() => Math.random() - 0.5)
+      .filter((item) => item.title && item.url && item.content);

     return res.json({ blogs: data });
   } catch (err: any) {
@@ -5,14 +5,17 @@ import { getAvailableChatModelProviders } from '../lib/providers';
 import { HumanMessage, AIMessage } from '@langchain/core/messages';
 import logger from '../utils/logger';
 import { ChatOpenAI } from '@langchain/openai';
+import {
+  getCustomOpenaiApiKey,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiModelName,
+} from '../config';
 
 const router = express.Router();
 
 interface ChatModel {
   provider: string;
   model: string;
-  customOpenAIBaseURL?: string;
-  customOpenAIKey?: string;
 }
 
 interface ImageSearchBody {
@@ -44,21 +47,12 @@ router.post('/', async (req, res) => {
   let llm: BaseChatModel | undefined;
 
   if (body.chatModel?.provider === 'custom_openai') {
-    if (
-      !body.chatModel?.customOpenAIBaseURL ||
-      !body.chatModel?.customOpenAIKey
-    ) {
-      return res
-        .status(400)
-        .json({ message: 'Missing custom OpenAI base URL or key' });
-    }
-
     llm = new ChatOpenAI({
-      modelName: body.chatModel.model,
-      openAIApiKey: body.chatModel.customOpenAIKey,
+      modelName: getCustomOpenaiModelName(),
+      openAIApiKey: getCustomOpenaiApiKey(),
       temperature: 0.7,
       configuration: {
-        baseURL: body.chatModel.customOpenAIBaseURL,
+        baseURL: getCustomOpenaiApiUrl(),
       },
     }) as unknown as BaseChatModel;
   } else if (
@@ -1,310 +1,158 @@
-import { Router, Response as ExpressResponse } from 'express';
-import { z } from 'zod';
-import fetch from 'node-fetch';
-import { Response as FetchResponse } from 'node-fetch';
-import { supabase } from '../lib/supabase';
-import { env } from '../config/env';
+import express from 'express';
+import logger from '../utils/logger';
+import type { BaseChatModel } from '@langchain/core/language_models/chat_models';
+import type { Embeddings } from '@langchain/core/embeddings';
+import { ChatOpenAI } from '@langchain/openai';
+import {
+  getAvailableChatModelProviders,
+  getAvailableEmbeddingModelProviders,
+} from '../lib/providers';
+import { searchHandlers } from '../websocket/messageHandler';
+import { AIMessage, BaseMessage, HumanMessage } from '@langchain/core/messages';
+import { MetaSearchAgentType } from '../search/metaSearchAgent';
+import {
+  getCustomOpenaiApiKey,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiModelName,
+} from '../config';
 
-const router = Router();
+const router = express.Router();
 
-const searchSchema = z.object({
-  query: z.string().min(1),
-});
-
-interface Business {
-  id: string;
-  name: string;
-  description: string;
-  website: string;
-  phone: string | null;
-  address: string | null;
+interface chatModel {
+  provider: string;
+  model: string;
+  customOpenAIKey?: string;
+  customOpenAIBaseURL?: string;
 }
 
-interface SearxResult {
-  url: string;
-  title: string;
-  content: string;
-  engine: string;
-  score: number;
+interface embeddingModel {
+  provider: string;
+  model: string;
 }
 
-interface SearxResponse {
+interface ChatRequestBody {
+  optimizationMode: 'speed' | 'balanced';
+  focusMode: string;
+  chatModel?: chatModel;
+  embeddingModel?: embeddingModel;
   query: string;
-  results: SearxResult[];
+  history: Array<[string, string]>;
 }
 
-async function getCachedResults(query: string): Promise<Business[]> {
-  console.log('Fetching cached results for query:', query);
-  const normalizedQuery = query.toLowerCase()
-    .trim()
-    .replace(/,/g, '') // Remove commas
-    .replace(/\s+/g, ' '); // Normalize whitespace
-
-  const searchTerms = normalizedQuery.split(' ').filter(term => term.length > 0);
-  console.log('Normalized search terms:', searchTerms);
-
-  // First try exact match
-  const { data: exactMatch } = await supabase
-    .from('search_cache')
-    .select('*')
-    .eq('query', normalizedQuery)
-    .single();
-
-  if (exactMatch) {
-    console.log('Found exact match in cache');
-    return exactMatch.results as Business[];
-  }
-
-  // Then try fuzzy search
-  console.log('Trying fuzzy search with terms:', searchTerms);
-  const searchConditions = searchTerms.map(term => `query.ilike.%${term}%`);
-  const { data: cachedResults, error } = await supabase
-    .from('search_cache')
-    .select('*')
-    .or(searchConditions.join(','));
-
-  if (error) {
-    console.error('Error fetching cached results:', error);
-    return [];
-  }
-
-  if (!cachedResults || cachedResults.length === 0) {
-    console.log('No cached results found');
-    return [];
-  }
-
-  console.log(`Found ${cachedResults.length} cached searches`);
-
-  // Combine and deduplicate results from all matching searches
-  const allResults = cachedResults.flatMap(cache => cache.results as Business[]);
-  const uniqueResults = Array.from(new Map(allResults.map(item => [item.id, item])).values());
-
-  console.log(`Combined into ${uniqueResults.length} unique businesses`);
-
-  // Sort by relevance to search terms
-  const sortedResults = uniqueResults.sort((a, b) => {
-    const aScore = searchTerms.filter(term =>
-      a.name.toLowerCase().includes(term) ||
-      a.description.toLowerCase().includes(term)
-    ).length;
-    const bScore = searchTerms.filter(term =>
-      b.name.toLowerCase().includes(term) ||
-      b.description.toLowerCase().includes(term)
-    ).length;
-    return bScore - aScore;
-  });
-
-  return sortedResults;
-}
-
-async function searchSearxNG(query: string): Promise<Business[]> {
-  console.log('Starting SearxNG search for query:', query);
+router.post('/', async (req, res) => {
   try {
-    const params = new URLSearchParams({
-      q: `${query} denver business`,
-      format: 'json',
-      language: 'en',
-      time_range: '',
-      safesearch: '1',
-      engines: 'google,bing,duckduckgo'
-    });
-
-    const searchUrl = `${env.SEARXNG_URL}/search?${params.toString()}`;
-    console.log('Searching SearxNG at URL:', searchUrl);
-
-    const response: FetchResponse = await fetch(searchUrl, {
-      method: 'GET',
-      headers: {
-        'Accept': 'application/json',
+    const body: ChatRequestBody = req.body;
+
+    if (!body.focusMode || !body.query) {
+      return res.status(400).json({ message: 'Missing focus mode or query' });
+    }
+
+    body.history = body.history || [];
+    body.optimizationMode = body.optimizationMode || 'balanced';
+
+    const history: BaseMessage[] = body.history.map((msg) => {
+      if (msg[0] === 'human') {
+        return new HumanMessage({
+          content: msg[1],
+        });
+      } else {
+        return new AIMessage({
+          content: msg[1],
+        });
       }
     });
 
-    if (!response.ok) {
-      throw new Error(`SearxNG search failed: ${response.statusText} (${response.status})`);
+    const [chatModelProviders, embeddingModelProviders] = await Promise.all([
+      getAvailableChatModelProviders(),
+      getAvailableEmbeddingModelProviders(),
+    ]);
+
+    const chatModelProvider =
+      body.chatModel?.provider || Object.keys(chatModelProviders)[0];
+    const chatModel =
+      body.chatModel?.model ||
+      Object.keys(chatModelProviders[chatModelProvider])[0];
+
+    const embeddingModelProvider =
+      body.embeddingModel?.provider || Object.keys(embeddingModelProviders)[0];
+    const embeddingModel =
+      body.embeddingModel?.model ||
+      Object.keys(embeddingModelProviders[embeddingModelProvider])[0];
+
+    let llm: BaseChatModel | undefined;
+    let embeddings: Embeddings | undefined;
+
+    if (body.chatModel?.provider === 'custom_openai') {
+      llm = new ChatOpenAI({
+        modelName: body.chatModel?.model || getCustomOpenaiModelName(),
+        openAIApiKey:
+          body.chatModel?.customOpenAIKey || getCustomOpenaiApiKey(),
+        temperature: 0.7,
+        configuration: {
+          baseURL:
+            body.chatModel?.customOpenAIBaseURL || getCustomOpenaiApiUrl(),
+        },
+      }) as unknown as BaseChatModel;
+    } else if (
+      chatModelProviders[chatModelProvider] &&
+      chatModelProviders[chatModelProvider][chatModel]
+    ) {
+      llm = chatModelProviders[chatModelProvider][chatModel]
+        .model as unknown as BaseChatModel | undefined;
     }
 
-    const data = await response.json() as SearxResponse;
-    console.log(`Got ${data.results?.length || 0} raw results from SearxNG`);
-    console.log('Sample result:', data.results?.[0]);
-
-    if (!data.results || data.results.length === 0) {
-      return [];
+    if (
+      embeddingModelProviders[embeddingModelProvider] &&
+      embeddingModelProviders[embeddingModelProvider][embeddingModel]
+    ) {
+      embeddings = embeddingModelProviders[embeddingModelProvider][
+        embeddingModel
+      ].model as Embeddings | undefined;
     }
 
-    const filteredResults = data.results
-      .filter(result =>
-        result.title &&
-        result.url &&
-        !result.url.includes('yelp.com/search') &&
-        !result.url.includes('google.com/search') &&
-        !result.url.includes('bbb.org/search') &&
-        !result.url.includes('thumbtack.com/search') &&
-        !result.url.includes('angi.com/search') &&
-        !result.url.includes('yellowpages.com/search')
-      );
-
-    console.log(`Filtered to ${filteredResults.length} relevant results`);
-    console.log('Sample filtered result:', filteredResults[0]);
-
-    const searchTerms = query.toLowerCase().split(' ');
-    const businesses = filteredResults
-      .map(result => {
-        const business = {
-          id: result.url,
-          name: cleanBusinessName(result.title),
-          description: result.content || '',
-          website: result.url,
-          phone: extractPhone(result.content || '') || extractPhone(result.title),
-          address: extractAddress(result.content || '') || extractAddress(result.title),
-          score: result.score || 0
-        };
-        console.log('Processed business:', business);
-        return business;
-      })
-      .filter(business => {
-        // Check if business name contains any of the search terms
-        const nameMatches = searchTerms.some(term =>
-          business.name.toLowerCase().includes(term)
-        );
-
-        // Check if description contains any of the search terms
-        const descriptionMatches = searchTerms.some(term =>
-          business.description.toLowerCase().includes(term)
-        );
-
-        return business.name.length > 2 && (nameMatches || descriptionMatches);
-      })
-      .sort((a, b) => {
-        // Score based on how many search terms match the name and description
-        const aScore = searchTerms.filter(term =>
-          a.name.toLowerCase().includes(term) ||
-          a.description.toLowerCase().includes(term)
-        ).length;
-        const bScore = searchTerms.filter(term =>
-          b.name.toLowerCase().includes(term) ||
-          b.description.toLowerCase().includes(term)
-        ).length;
-        return bScore - aScore;
-      })
-      .slice(0, 10);
-
-    console.log(`Transformed into ${businesses.length} business entries`);
-    return businesses;
-  } catch (error) {
-    console.error('SearxNG search error:', error);
-    return [];
-  }
-}
-
-async function cacheResults(query: string, results: Business[]): Promise<void> {
-  if (!results.length) return;
-
-  console.log(`Caching ${results.length} results for query:`, query);
-  const normalizedQuery = query.toLowerCase().trim();
-
-  const { data: existing } = await supabase
-    .from('search_cache')
-    .select('id, results')
-    .eq('query', normalizedQuery)
-    .single();
-
-  if (existing) {
-    console.log('Updating existing cache entry');
-    // Merge new results with existing ones, removing duplicates
-    const allResults = [...existing.results, ...results];
-    const uniqueResults = Array.from(new Map(allResults.map(item => [item.id, item])).values());
-
-    await supabase
-      .from('search_cache')
-      .update({
-        results: uniqueResults,
-        updated_at: new Date().toISOString()
-      })
-      .eq('id', existing.id);
-  } else {
-    console.log('Creating new cache entry');
-    await supabase
-      .from('search_cache')
-      .insert({
-        query: normalizedQuery,
-        results,
-        location: 'denver', // Default location
-        category: 'business', // Default category
-        created_at: new Date().toISOString(),
-        updated_at: new Date().toISOString(),
-        expires_at: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000).toISOString() // 7 days from now
-      });
-  }
-}
-
-function cleanBusinessName(title: string): string {
-  return title
-    .replace(/^(the\s+)?/i, '')
-    .replace(/\s*[-|]\s*.+$/i, '')
-    .replace(/\s*\|.*$/i, '')
-    .replace(/\s*in\s+denver.*$/i, '')
-    .replace(/\s*near\s+denver.*$/i, '')
-    .replace(/\s*-\s*.*denver.*$/i, '')
-    .trim();
-}
-
-function extractPhone(text: string): string | null {
-  const phoneRegex = /(\+?1?\s*\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4})/;
-  const match = text.match(phoneRegex);
-  return match ? match[1] : null;
-}
-
-function extractAddress(text: string): string | null {
-  const addressRegex = /\d+\s+[A-Za-z0-9\s,]+(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Lane|Ln|Drive|Dr|Way|Court|Ct|Circle|Cir)[,\s]+(?:[A-Za-z\s]+,\s*)?(?:CO|Colorado)[,\s]+\d{5}(?:-\d{4})?/i;
-  const match = text.match(addressRegex);
-  return match ? match[0] : null;
-}
-
-router.post('/search', async (req, res) => {
-  try {
-    console.log('Received search request:', req.body);
-    const { query } = searchSchema.parse(req.body);
-    await handleSearch(query, res);
-  } catch (error) {
-    console.error('Search error:', error);
-    res.status(400).json({ error: 'Search failed. Please try again.' });
+    if (!llm || !embeddings) {
+      return res.status(400).json({ message: 'Invalid model selected' });
+    }
+
+    const searchHandler: MetaSearchAgentType = searchHandlers[body.focusMode];
+
+    if (!searchHandler) {
+      return res.status(400).json({ message: 'Invalid focus mode' });
+    }
+
+    const emitter = await searchHandler.searchAndAnswer(
+      body.query,
+      history,
+      llm,
+      embeddings,
+      body.optimizationMode,
+      [],
+    );
+
+    let message = '';
+    let sources = [];
+
+    emitter.on('data', (data) => {
+      const parsedData = JSON.parse(data);
+      if (parsedData.type === 'response') {
+        message += parsedData.data;
+      } else if (parsedData.type === 'sources') {
+        sources = parsedData.data;
+      }
+    });
+
+    emitter.on('end', () => {
+      res.status(200).json({ message, sources });
+    });
+
+    emitter.on('error', (data) => {
+      const parsedData = JSON.parse(data);
+      res.status(500).json({ message: parsedData.data });
+    });
+  } catch (err: any) {
+    logger.error(`Error in getting search results: ${err.message}`);
+    res.status(500).json({ message: 'An error has occurred.' });
   }
 });
 
-// Also support GET requests for easier testing
-router.get('/search', async (req, res) => {
-  try {
-    const query = req.query.q as string;
-    if (!query) {
-      return res.status(400).json({ error: 'Query parameter "q" is required' });
-    }
-    console.log('Received search request:', { query });
-    await handleSearch(query, res);
-  } catch (error) {
-    console.error('Search error:', error);
-    res.status(400).json({ error: 'Search failed. Please try again.' });
-  }
-});
-
-// Helper function to handle search logic
-async function handleSearch(query: string, res: ExpressResponse) {
-  // Get cached results immediately
-  const cachedResults = await getCachedResults(query);
-  console.log(`Returning ${cachedResults.length} cached results to client`);
-
-  // Send cached results to client
-  res.json({ results: cachedResults });
-
-  // Search for new results in the background
-  console.log('Starting background search');
-  searchSearxNG(query).then(async newResults => {
-    console.log(`Found ${newResults.length} new results from SearxNG`);
-    if (newResults.length > 0) {
-      await cacheResults(query, newResults);
-    }
-  }).catch(error => {
-    console.error('Background search error:', error);
-  });
-}
-
 export default router;
@@ -5,14 +5,17 @@ import { getAvailableChatModelProviders } from '../lib/providers';
 import { HumanMessage, AIMessage } from '@langchain/core/messages';
 import logger from '../utils/logger';
 import { ChatOpenAI } from '@langchain/openai';
+import {
+  getCustomOpenaiApiKey,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiModelName,
+} from '../config';
 
 const router = express.Router();
 
 interface ChatModel {
   provider: string;
   model: string;
-  customOpenAIBaseURL?: string;
-  customOpenAIKey?: string;
 }
 
 interface SuggestionsBody {
@@ -43,21 +46,12 @@ router.post('/', async (req, res) => {
   let llm: BaseChatModel | undefined;
 
   if (body.chatModel?.provider === 'custom_openai') {
-    if (
-      !body.chatModel?.customOpenAIBaseURL ||
-      !body.chatModel?.customOpenAIKey
-    ) {
-      return res
-        .status(400)
-        .json({ message: 'Missing custom OpenAI base URL or key' });
-    }
-
     llm = new ChatOpenAI({
-      modelName: body.chatModel.model,
-      openAIApiKey: body.chatModel.customOpenAIKey,
+      modelName: getCustomOpenaiModelName(),
+      openAIApiKey: getCustomOpenaiApiKey(),
       temperature: 0.7,
       configuration: {
-        baseURL: body.chatModel.customOpenAIBaseURL,
+        baseURL: getCustomOpenaiApiUrl(),
       },
     }) as unknown as BaseChatModel;
   } else if (
@@ -5,14 +5,17 @@ import { HumanMessage, AIMessage } from '@langchain/core/messages';
 import logger from '../utils/logger';
 import handleVideoSearch from '../chains/videoSearchAgent';
 import { ChatOpenAI } from '@langchain/openai';
+import {
+  getCustomOpenaiApiKey,
+  getCustomOpenaiApiUrl,
+  getCustomOpenaiModelName,
+} from '../config';
 
 const router = express.Router();
 
 interface ChatModel {
   provider: string;
   model: string;
-  customOpenAIBaseURL?: string;
-  customOpenAIKey?: string;
 }
 
 interface VideoSearchBody {
@@ -44,21 +47,12 @@ router.post('/', async (req, res) => {
   let llm: BaseChatModel | undefined;
 
   if (body.chatModel?.provider === 'custom_openai') {
-    if (
-      !body.chatModel?.customOpenAIBaseURL ||
-      !body.chatModel?.customOpenAIKey
-    ) {
-      return res
-        .status(400)
-        .json({ message: 'Missing custom OpenAI base URL or key' });
-    }
-
     llm = new ChatOpenAI({
-      modelName: body.chatModel.model,
-      openAIApiKey: body.chatModel.customOpenAIKey,
+      modelName: getCustomOpenaiModelName(),
+      openAIApiKey: getCustomOpenaiApiKey(),
      temperature: 0.7,
       configuration: {
-        baseURL: body.chatModel.customOpenAIBaseURL,
+        baseURL: getCustomOpenaiApiUrl(),
       },
     }) as unknown as BaseChatModel;
   } else if (
@@ -17,7 +17,12 @@ import LineListOutputParser from '../lib/outputParsers/listLineOutputParser';
 import LineOutputParser from '../lib/outputParsers/lineOutputParser';
 import { getDocumentsFromLinks } from '../utils/documents';
 import { Document } from 'langchain/document';
-import { searchSearxng } from '../lib/searxng';
+import { searchSearxng } from '../lib/searchEngines/searxng';
+import { searchGooglePSE } from '../lib/searchEngines/google_pse';
+import { searchBingAPI } from '../lib/searchEngines/bing';
+import { searchBraveAPI } from '../lib/searchEngines/brave';
+import { searchYaCy } from '../lib/searchEngines/yacy';
+import { getSearchEngineBackend } from '../config';
 import path from 'path';
 import fs from 'fs';
 import computeSimilarity from '../utils/computeSimilarity';
@@ -132,7 +137,7 @@ class MetaSearchAgent implements MetaSearchAgentType {
 You are a web search summarizer, tasked with summarizing a piece of text retrieved from a web search. Your job is to summarize the
 text into a detailed, 2-4 paragraph explanation that captures the main ideas and provides a comprehensive answer to the query.
 If the query is \"summarize\", you should provide a detailed summary of the text. If the query is a specific question, you should answer it in the summary.
 
 - **Journalistic tone**: The summary should sound professional and journalistic, not too casual or vague.
 - **Thorough and detailed**: Ensure that every key point from the text is captured and that the summary directly answers the query.
 - **Not too lengthy, but detailed**: The summary should be informative but not excessively long. Focus on providing detailed information in a concise format.
@@ -203,10 +208,37 @@ class MetaSearchAgent implements MetaSearchAgentType {
 
       return { query: question, docs: docs };
     } else {
-      const res = await searchSearxng(question, {
-        language: 'en',
-        engines: this.config.activeEngines,
-      });
+      const searchEngine = getSearchEngineBackend();
+
+      let res;
+      switch (searchEngine) {
+        case 'searxng':
+          res = await searchSearxng(question, {
+            language: 'en',
+            engines: this.config.activeEngines,
+          });
+          break;
+        case 'google':
+          res = await searchGooglePSE(question);
+          break;
+        case 'bing':
+          res = await searchBingAPI(question);
+          break;
+        case 'brave':
+          res = await searchBraveAPI(question);
+          break;
+        case 'yacy':
+          res = await searchYaCy(question);
+          break;
+        default:
+          throw new Error(`Unknown search engine ${searchEngine}`);
+      }
+
+      if (!res?.results) {
+        throw new Error(
+          `No results found for search engine: ${searchEngine}`,
+        );
+      }
+
       const documents = res.results.map(
         (result) =>
@@ -1,21 +0,0 @@
|
|||||||
import express from 'express';
|
|
||||||
import cors from 'cors';
|
|
||||||
import { env } from './config/env';
|
|
||||||
import app from './app';
|
|
||||||
import { HealthCheckService } from './lib/services/healthCheck';
|
|
||||||
|
|
||||||
const port = env.PORT || 3000;
|
|
||||||
|
|
||||||
// Health check endpoint
|
|
||||||
app.get('/health', async (req, res) => {
|
|
||||||
const health = await HealthCheckService.checkHealth();
|
|
||||||
res.json(health);
|
|
||||||
});
|
|
||||||
|
|
||||||
export function startServer() {
|
|
||||||
return app.listen(port, () => {
|
|
||||||
console.log(`Server is running on port ${port}`);
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
export default app;
|
|
@@ -1,3 +0,0 @@
|
|||||||
@tailwind base;
|
|
||||||
@tailwind components;
|
|
||||||
@tailwind utilities;
|
|
@@ -1,102 +0,0 @@
import { createClient } from '@supabase/supabase-js';
import dotenv from 'dotenv';

// Load environment variables
dotenv.config();

async function testSupabaseConnection() {
  console.log('Testing Supabase connection...');
  console.log('URL:', process.env.SUPABASE_URL);
  console.log('Key length:', process.env.SUPABASE_KEY?.length || 0);

  try {
    const supabase = createClient(
      process.env.SUPABASE_URL!,
      process.env.SUPABASE_KEY!,
      {
        auth: {
          autoRefreshToken: true,
          persistSession: true
        }
      }
    );

    // Test businesses table
    console.log('\nTesting businesses table:');
    const testBusiness = {
      id: 'test_' + Date.now(),
      name: 'Test Business',
      phone: '123-456-7890',
      email: 'test@example.com',
      address: '123 Test St',
      rating: 5,
      website: 'https://test.com',
      source: 'test',
      description: 'Test description',
      latitude: 39.7392,
      longitude: -104.9903,
      search_count: 1,
      created_at: new Date().toISOString()
    };

    const { error: insertBusinessError } = await supabase
      .from('businesses')
      .insert([testBusiness])
      .select();

    if (insertBusinessError) {
      console.error('❌ INSERT business error:', insertBusinessError);
    } else {
      console.log('✅ INSERT business OK');
      // Clean up
      await supabase.from('businesses').delete().eq('id', testBusiness.id);
    }

    // Test searches table
    console.log('\nTesting searches table:');
    const testSearch = {
      query: 'test query',
      location: 'test location',
      results_count: 0,
      timestamp: new Date().toISOString()
    };

    const { error: insertSearchError } = await supabase
      .from('searches')
      .insert([testSearch])
      .select();

    if (insertSearchError) {
      console.error('❌ INSERT search error:', insertSearchError);
    } else {
      console.log('✅ INSERT search OK');
    }

    // Test cache table
    console.log('\nTesting cache table:');
    const testCache = {
      key: 'test_key_' + Date.now(),
      value: { test: true },
      created_at: new Date().toISOString(),
      expires_at: new Date(Date.now() + 3600000).toISOString()
    };

    const { error: insertCacheError } = await supabase
      .from('cache')
      .insert([testCache])
      .select();

    if (insertCacheError) {
      console.error('❌ INSERT cache error:', insertCacheError);
    } else {
      console.log('✅ INSERT cache OK');
      // Clean up
      await supabase.from('cache').delete().eq('key', testCache.key);
    }

  } catch (error: any) {
    console.error('❌ Unexpected error:', error);
  }
}

testSupabaseConnection().catch(console.error);
@@ -1,139 +0,0 @@
import { createClient } from '@supabase/supabase-js';

// Mock data type
type MockData = {
  businesses: { id: string; name: string };
  cache: { key: string; value: { test: boolean } };
};

// Mock Supabase client
jest.mock('@supabase/supabase-js', () => ({
  createClient: jest.fn(() => ({
    from: jest.fn((table: keyof MockData) => {
      const mockData: MockData = {
        businesses: { id: 'test_1', name: 'Test Business' },
        cache: { key: 'test_key', value: { test: true } }
      };

      return {
        insert: jest.fn(() => ({
          select: jest.fn().mockResolvedValue({
            data: [mockData[table]],
            error: null
          })
        })),
        select: jest.fn(() => ({
          eq: jest.fn(() => ({
            single: jest.fn().mockResolvedValue({
              data: mockData[table],
              error: null
            }),
            gt: jest.fn(() => ({
              single: jest.fn().mockResolvedValue({
                data: null,
                error: null
              })
            }))
          }))
        })),
        update: jest.fn(() => ({
          eq: jest.fn().mockResolvedValue({
            error: null
          })
        })),
        delete: jest.fn(() => ({
          eq: jest.fn().mockResolvedValue({
            error: null
          })
        }))
      };
    })
  }))
}));

describe('Database Operations', () => {
  const supabase = createClient('test-url', 'test-key');

  const testBusiness = {
    id: `test_${Date.now()}`,
    name: 'Test Business',
    phone: '(303) 555-1234',
    email: 'test@example.com',
    address: '123 Test St, Denver, CO 80202',
    rating: 5,
    website: 'https://test.com',
    source: 'test',
    description: 'Test description',
    location: { lat: 39.7392, lng: -104.9903 },
    search_count: 1,
    created_at: new Date().toISOString()
  };

  beforeEach(() => {
    jest.clearAllMocks();
  });

  describe('Business Operations', () => {
    it('should insert a business successfully', async () => {
      const { data, error } = await supabase
        .from('businesses')
        .insert([testBusiness])
        .select();

      expect(error).toBeNull();
      expect(data).toBeTruthy();
      expect(data![0].name).toBe('Test Business');
    });

    it('should retrieve a business by id', async () => {
      const { data, error } = await supabase
        .from('businesses')
        .select()
        .eq('id', testBusiness.id)
        .single();

      expect(error).toBeNull();
      expect(data).toBeTruthy();
      expect(data.name).toBe('Test Business');
    });

    it('should update a business', async () => {
      const { error } = await supabase
        .from('businesses')
        .update({ name: 'Updated Test Business' })
        .eq('id', testBusiness.id);

      expect(error).toBeNull();
    });
  });

  describe('Cache Operations', () => {
    const testCache = {
      key: `test_key_${Date.now()}`,
      value: { test: true },
      created_at: new Date().toISOString(),
      expires_at: new Date(Date.now() + 3600000).toISOString()
    };

    it('should insert cache entry', async () => {
      const { data, error } = await supabase
        .from('cache')
        .insert([testCache])
        .select();

      expect(error).toBeNull();
      expect(data).toBeTruthy();
    });

    it('should retrieve cache entry', async () => {
      const { data, error } = await supabase
        .from('cache')
        .select()
        .eq('key', testCache.key)
        .single();

      expect(error).toBeNull();
      expect(data.value).toEqual({ test: true });
    });
  });
});
@@ -1,92 +0,0 @@
import { DeepSeekService } from '../../lib/services/deepseekService';
import { Business } from '../../lib/types';

// Mock the DeepSeek service
jest.mock('../../lib/services/deepseekService', () => {
  const mockCleanedBusiness = {
    name: "Denver's Best Plumbing & Repair",
    address: "1234 Main Street, Denver, CO 80202",
    phone: "(720) 555-1234",
    email: "support@denverplumbing.com",
    description: "Professional plumbing services in Denver metro area"
  };

  return {
    DeepSeekService: {
      chat: jest.fn().mockResolvedValue(JSON.stringify({
        business_info: mockCleanedBusiness
      })),
      detectBusinessType: jest.fn().mockReturnValue('service'),
      sanitizeJsonResponse: jest.fn().mockReturnValue(mockCleanedBusiness),
      manualClean: jest.fn().mockReturnValue(mockCleanedBusiness),
      cleanBusinessData: jest.fn().mockResolvedValue(mockCleanedBusiness)
    }
  };
});

describe('DeepSeekService', () => {
  describe('cleanBusinessData', () => {
    const testBusiness: Business = {
      id: 'test_1',
      name: "Denver's Best Plumbing & Repair [LLC] (A Family Business)",
      address: "Suite 200-B, 1234 Main Street, Denver, Colorado 80202",
      phone: "(720) 555-1234",
      email: "support@denverplumbing.com",
      description: "Professional plumbing services in Denver metro area",
      source: 'test',
      website: 'https://example.com',
      rating: 4.8,
      location: { lat: 39.7392, lng: -104.9903 },
      openingHours: []
    };

    beforeEach(() => {
      jest.clearAllMocks();
    });

    it('should clean business name correctly', async () => {
      const cleaned = await DeepSeekService.cleanBusinessData(testBusiness);
      expect(cleaned.name).not.toMatch(/[\[\]{}()]/);
      expect(cleaned.name).toBeTruthy();
    });

    it('should format phone number correctly', async () => {
      const cleaned = await DeepSeekService.cleanBusinessData(testBusiness);
      expect(cleaned.phone).toMatch(/^\(\d{3}\) \d{3}-\d{4}$/);
    });

    it('should clean email address', async () => {
      const cleaned = await DeepSeekService.cleanBusinessData(testBusiness);
      expect(cleaned.email).not.toMatch(/[\[\]<>()]|mailto:|click|schedule/i);
      expect(cleaned.email).toMatch(/^[^\s@]+@[^\s@]+\.[^\s@]+$/);
    });

    it('should clean description', async () => {
      const cleaned = await DeepSeekService.cleanBusinessData(testBusiness);
      expect(cleaned.description).not.toMatch(/[\$\d]+%?\s*off|\$/i);
      expect(cleaned.description).not.toMatch(/\b(?:call|email|visit|contact|text|www\.|http|@)\b/i);
      expect(cleaned.description).not.toMatch(/[📞📧🌐💳☎️📱]/);
      expect(cleaned.description).not.toMatch(/#\w+/);
    });
  });

  describe('chat', () => {
    it('should return a response from the model', async () => {
      const response = await DeepSeekService['chat']([{
        role: 'user',
        content: 'Test message'
      }]);
      expect(response).toBeTruthy();
      expect(typeof response).toBe('string');
    });

    it('should handle errors gracefully', async () => {
      (DeepSeekService['chat'] as jest.Mock).mockRejectedValueOnce(new Error('Test error'));

      await expect(DeepSeekService['chat']([{
        role: 'user',
        content: 'Test message'
      }])).rejects.toThrow('Test error');
    });
  });
});