Text Analysis

Text Similarity

Measures text similarity using the Jaccard coefficient, which compares the word overlap between two strings. Access it via MCP in Cursor or Claude Code, or call GET /v1/text/similarity directly. Returns a decimal score from 0 (no words in common) to 1 (identical word sets). Useful for content deduplication, document clustering, or plagiarism detection in AI workflows.

API Endpoint

GET /v1/text/similarity

Code Examples

cURL

curl "https://tinyfn.io/v1/text/similarity" \
  -H "X-API-Key: YOUR_API_KEY"

JavaScript

const response = await fetch('https://tinyfn.io/v1/text/similarity', {
  headers: { 'X-API-Key': 'YOUR_API_KEY' }
});
const data = await response.json();
console.log(data);

Python

import requests

response = requests.get('https://tinyfn.io/v1/text/similarity',
    headers={'X-API-Key': 'YOUR_API_KEY'})
data = response.json()
print(data)

Use via MCP

Add to your AI agent

Connect your AI agent (Claude, Cursor, Windsurf, etc.) to TinyFn's text analysis tools:

{
  "mcpServers": {
    "tinyfn-text": {
      "url": "https://tinyfn.io/mcp/text",
      "headers": {
        "X-API-Key": "YOUR_API_KEY"
      }
    }
  }
}


FAQ

How does Jaccard similarity work for text comparison?

Jaccard similarity divides shared words by total unique words across both texts. For example, 'hello world' vs 'world peace' has 1 shared word out of 3 total unique words, giving 0.333.
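The calculation above can be reproduced locally with a few lines of Python. This is a minimal sketch for illustration, not TinyFn's implementation: the function name `jaccard`, the whitespace tokenization, and the handling of two empty inputs are all assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity: shared words / total unique words."""
    words_a = set(a.split())  # assumption: words are split on whitespace
    words_b = set(b.split())
    union = words_a | words_b
    if not union:
        return 1.0  # assumption: two empty texts are treated as identical
    return len(words_a & words_b) / len(union)


# 'world' is the only shared word; 'hello', 'world', 'peace' are the 3 unique words.
print(round(jaccard("hello world", "world peace"), 3))  # 0.333
```

Because the inputs are reduced to sets, repeating a word in either text does not change the score.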

What's the difference between Jaccard and cosine similarity?

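Jaccard considers only whether a word is present or absent, while cosine similarity compares word-frequency vectors, so repeated words affect the score. Neither measure accounts for word order when computed on plain word counts. Jaccard is simpler and faster for basic overlap detection; cosine similarity is better suited to semantic comparison when it is applied to embedding vectors rather than raw counts.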
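To make the contrast concrete, the sketch below compares the two measures on texts that share the same vocabulary but use the words with different frequencies. Both functions are hypothetical local helpers written for this illustration (whitespace tokenization assumed); cosine similarity is not something the endpoint is documented to return.

```python
import math
from collections import Counter


def jaccard(a: str, b: str) -> float:
    """Set-based overlap: word frequency is ignored."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def cosine_counts(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text has zero similarity
    return dot / (norm_a * norm_b)


x = "spam spam spam eggs"
y = "spam eggs eggs eggs"
print(jaccard(x, y))                  # 1.0, the word sets are identical
print(round(cosine_counts(x, y), 3))  # 0.6, the frequencies differ
```

Swapping the order of words in either text leaves both scores unchanged, since neither function looks at position.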

Can I use this for duplicate content detection in MCP agents?

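Yes. Set a threshold such as 0.7 or higher to flag likely duplicates, and raise it if you see too many false positives. AI agents in Cursor can batch-compare documents and flag potential duplicates automatically, and because the calculation is deterministic, the same pair of texts always yields the same score.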
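The thresholding logic can be sketched as a pairwise comparison over a small batch. This example scores pairs with a local Jaccard helper so that it runs offline; in an agent workflow the score would come from the similarity tool instead. The names `find_likely_duplicates` and `docs`, and the sample texts, are made up for illustration.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Local stand-in for the similarity score (whitespace tokens assumed)."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def find_likely_duplicates(docs: dict, threshold: float = 0.7) -> list:
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    flagged = []
    for (id_a, text_a), (id_b, text_b) in combinations(docs.items(), 2):
        score = jaccard(text_a, text_b)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    return flagged


docs = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox jumps over the lazy cat",
    "c": "an entirely different sentence about databases",
}

for id_a, id_b, score in find_likely_duplicates(docs):
    print(id_a, id_b, round(score, 3))  # a b 0.778
```

Comparing every pair grows quadratically with the number of documents, so large batches may need to be pre-grouped before scoring.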

Does text similarity handle case sensitivity and punctuation?

The tool typically normalizes text by converting to lowercase and removing punctuation before comparison. Check the API response for preprocessing details specific to your use case.
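The effect of this kind of preprocessing is easy to see locally. The sketch below assumes the normalization consists of lowercasing and stripping ASCII punctuation; the exact steps the API applies are not specified here, and `normalize` and `jaccard_sets` are hypothetical helper names.

```python
import string


def normalize(text: str) -> set:
    """Lowercase, strip ASCII punctuation, split on whitespace (assumed steps)."""
    lowered = text.lower()
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    return set(stripped.split())


def jaccard_sets(sa: set, sb: set) -> float:
    """Jaccard similarity of two word sets."""
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


a, b = "Hello, World!", "hello world"

raw_score = jaccard_sets(set(a.split()), set(b.split()))
clean_score = jaccard_sets(normalize(a), normalize(b))

print(raw_score)    # 0.0, 'Hello,' and 'World!' match nothing in the raw text
print(clean_score)  # 1.0, both texts reduce to {'hello', 'world'}
```

Stripping punctuation also merges contractions, so "don't" becomes "dont"; whether that matters depends on the texts being compared.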

What similarity score indicates two texts are related?

As a rule of thumb, scores above 0.5 suggest moderate similarity and scores of 0.7 or higher indicate strong similarity. The right threshold depends on your use case: related news articles might match at 0.3, while strict duplicate detection may call for 0.8 or higher.

Try Text Similarity Now

Get your free API key and start using Text Similarity in seconds.
