ogie

Extract OpenGraph, Twitter Cards, and metadata from URLs or HTML, with dual extraction modes and social diagnostics for validator workflows.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "ogie" with this command: npx skills add dobroslavradosavljevic/ogie/dobroslavradosavljevic-ogie-ogie

OpenGraph & Metadata Extraction

Use this skill when helping users extract metadata from webpages, build link previews, create SEO tools, parse OpenGraph/Twitter Card data, or run social metadata validation audits.

Quick Start

import { extract } from "ogie";

const result = await extract("https://github.com");

if (result.success) {
  console.log(result.data.og.title);
  console.log(result.data.og.description);
  console.log(result.data.og.images[0]?.url);
}

Core Functions

extract(url, options?)

Fetch and extract metadata from a URL.

import { extract } from "ogie";

const result = await extract("https://example.com", {
  timeout: 10000,
  maxRedirects: 5,
  userAgent: "MyBot/1.0",
  fetchOEmbed: true,
  convertCharset: true,
});

if (result.success) {
  console.log(result.data.og.title);
  console.log(result.data.twitter.card);
  console.log(result.data.basic.favicon);
}

extractFromHtml(html, options?)

Extract metadata from an HTML string without network requests.

import { extractFromHtml } from "ogie";

const html = `
  <html>
  <head>
    <meta property="og:title" content="My Page">
    <meta property="og:image" content="/images/hero.jpg">
  </head>
  </html>
`;

const result = extractFromHtml(html, {
  baseUrl: "https://example.com", // Required for relative URLs
});
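
The `baseUrl` option matters because a relative path such as `/images/hero.jpg` carries no host of its own. As a rough illustration of what resolving against a base means, the sketch below uses the standard `URL` constructor. `resolveAgainstBase` is a hypothetical helper written for this example. It is not part of ogie and does not claim to mirror ogie's internal resolution logic.

```typescript
// Illustration only: resolve a possibly relative reference against a base
// with the standard URL constructor. `resolveAgainstBase` is a hypothetical
// helper, not an ogie export.
function resolveAgainstBase(ref: string, baseUrl: string): string | undefined {
  try {
    return new URL(ref, baseUrl).href;
  } catch {
    // The base (or the reference) could not be parsed as a URL.
    return undefined;
  }
}

const resolved = resolveAgainstBase("/images/hero.jpg", "https://example.com");
const absolute = resolveAgainstBase("https://cdn.example.org/a.png", "https://example.com");
console.log(resolved); // https://example.com/images/hero.jpg
console.log(absolute); // https://cdn.example.org/a.png
```

An already absolute reference ignores the base, which is why passing `baseUrl` is harmless for pages that only use absolute image URLs.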

extractWithDiagnostics(url, options?)

Fetch metadata and return social diagnostics (valid, invalid, missing, warnings).

import { extractWithDiagnostics } from "ogie";

const result = await extractWithDiagnostics("https://example.com", {
  mode: "platform-valid",
});

if (result.success) {
  console.log(result.data.og.title);
  console.log(result.diagnostics.summary);
}

extractFromHtmlWithDiagnostics(html, options?)

Parse HTML and return metadata with social diagnostics.

import { extractFromHtmlWithDiagnostics } from "ogie";

const result = extractFromHtmlWithDiagnostics(html, {
  baseUrl: "https://example.com",
  mode: "platform-valid",
});

extractBulk(urls, options?)

Extract metadata from multiple URLs with rate limiting.

import { extractBulk } from "ogie";

const result = await extractBulk(
  ["https://github.com", "https://twitter.com", "https://youtube.com"],
  {
    concurrency: 10,
    concurrencyPerDomain: 3,
    minDelayPerDomain: 200,
    onProgress: (p) => console.log(`${p.completed}/${p.total}`),
  }
);

for (const item of result.results) {
  if (item.result.success) {
    console.log(`${item.url}: ${item.result.data.og.title}`);
  }
}

createCache(options?)

Create an LRU cache for extraction results.

import { extract, createCache } from "ogie";

const cache = createCache({
  maxSize: 100,
  ttl: 300_000, // 5 minutes
});

// First call fetches, second returns cached
await extract("https://github.com", { cache });
await extract("https://github.com", { cache }); // Instant
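
For readers unfamiliar with how `maxSize` and `ttl` typically interact in an LRU cache, here is a minimal sketch built on an insertion-ordered `Map`. `TinyLruCache` and its injectable clock are hypothetical names for this example only. ogie's actual cache implementation is not shown in this document and may differ.

```typescript
// Hypothetical illustration of LRU eviction plus TTL expiry; not ogie's code.
class TinyLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttl: number;
  private now: () => number;

  constructor(maxSize: number, ttl: number, now: () => number = Date.now) {
    this.maxSize = maxSize;
    this.ttl = ttl;
    this.now = now; // injectable clock, handy for deterministic tests
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // entry outlived its TTL
      return undefined;
    }
    // Re-insert so the key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttl });
  }
}

let clock = 0;
const cache = new TinyLruCache<string>(2, 1000, () => clock);
cache.set("a", "A");
cache.set("b", "B");
cache.get("a");      // "a" is now the most recently used key
cache.set("c", "C"); // cache is full, so "b" is evicted
```

Once the fake clock reaches the TTL, the remaining entries are treated as expired on their next read.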

Extracted Metadata Types

Ogie extracts from 13 sources:

| Property | Description |
| --- | --- |
| data.og | OpenGraph (title, images, etc.) |
| data.twitter | Twitter Cards |
| data.basic | HTML meta tags, favicon, title |
| data.article | Article metadata (dates, author) |
| data.video | Video metadata (actors, duration) |
| data.music | Music metadata (album, duration) |
| data.book | Book metadata (ISBN, authors) |
| data.profile | Profile metadata (name, gender) |
| data.jsonLd | JSON-LD structured data |
| data.dublinCore | Dublin Core metadata |
| data.appLinks | App Links for deep linking |
| data.oEmbed | oEmbed data (if enabled) |
| data.oEmbedDiscovery | Discovered oEmbed endpoints |

Error Handling

import { extract, isFetchError, isParseError } from "ogie";

const url = "https://example.com"; // example URL; substitute your own
const result = await extract(url);

if (!result.success) {
  switch (result.error.code) {
    case "INVALID_URL":
    case "FETCH_ERROR":
    case "TIMEOUT":
    case "PARSE_ERROR":
    case "NO_HTML":
    case "REDIRECT_LIMIT":
      console.error(result.error.message);
  }

  if (isFetchError(result.error)) {
    console.log(`HTTP Status: ${result.error.statusCode}`);
  }
}
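
One way to act on these codes is to separate failures worth retrying from permanent ones. The sketch below is self-contained and does not import ogie. The `ErrorCode` union lists only the six codes shown in the switch above (the library may define others), and the retry policy itself is an assumption made for illustration, not something ogie prescribes.

```typescript
// Only the codes that appear in this document; ogie's real union may be larger.
type ErrorCode =
  | "INVALID_URL"
  | "FETCH_ERROR"
  | "TIMEOUT"
  | "PARSE_ERROR"
  | "NO_HTML"
  | "REDIRECT_LIMIT";

// Assumed policy: network-level failures may be transient, the rest are not.
const RETRYABLE: ReadonlySet<ErrorCode> = new Set<ErrorCode>(["FETCH_ERROR", "TIMEOUT"]);

// `shouldRetry` is a hypothetical helper. `statusCode` corresponds to the
// optional HTTP status read from fetch errors in the example above.
function shouldRetry(code: ErrorCode, statusCode?: number): boolean {
  if (!RETRYABLE.has(code)) return false;
  // Treat 4xx responses as permanent, except 429 (rate limited).
  if (statusCode !== undefined && statusCode >= 400 && statusCode < 500) {
    return statusCode === 429;
  }
  return true;
}

console.log(shouldRetry("TIMEOUT"));          // true
console.log(shouldRetry("FETCH_ERROR", 404)); // false
console.log(shouldRetry("INVALID_URL"));      // false
```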

Options Reference

ExtractOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| timeout | number | 10000 | Request timeout in ms |
| maxRedirects | number | 5 | Max redirects to follow |
| userAgent | string | ogie/2.0 | Custom User-Agent string |
| headers | Record<string, string> | {} | Custom HTTP headers |
| baseUrl | string | | Base URL for resolving relative paths |
| mode | "best-effort" \| "platform-valid" | "best-effort" | Extraction mode behavior |
| onlyOpenGraph | boolean | false | Legacy: skip OG fallback parsing only |
| allowPrivateUrls | boolean | false | Allow localhost/private IPs |
| fetchOEmbed | boolean | false | Fetch oEmbed endpoint |
| convertCharset | boolean | false | Auto charset detection |
| cache | MetadataCache \| false | | Cache instance |
| bypassCache | boolean | false | Force fresh fetch |

BulkOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| concurrency | number | 10 | Max parallel requests globally |
| concurrencyPerDomain | number | 3 | Max parallel per domain |
| minDelayPerDomain | number | 200 | Min ms between domain requests |
| requestsPerMinute | number | 600 | Global rate limit |
| timeout | number | 30000 | Timeout per request |
| continueOnError | boolean | true | Continue on failures |
| onProgress | function | | Progress callback |
| extractOptions | object | | Options passed to each extract |

Security

Ogie includes built-in protections:

  • SSRF protection (blocks private IPs by default)
  • URL validation (HTTP/HTTPS only)
  • Redirect limits (default: 5)
  • oEmbed endpoint validation

// Allow private URLs for local development
await extract("http://localhost:3000", {
  allowPrivateUrls: true,
});
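
To make "blocks private IPs" concrete, the sketch below shows the kind of check an SSRF guard commonly performs on IPv4 literals. `isPrivateIPv4` is a hypothetical helper for illustration. It covers only a few well-known IPv4 ranges, ignores IPv6 and DNS resolution, and should not be read as ogie's actual implementation or as a complete defense.

```typescript
// Illustration only: detect IPv4 literals in common private or local ranges.
// Not an ogie export, and not exhaustive.
function isPrivateIPv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback, 127.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, 169.254.0.0/16
  );
}

console.log(isPrivateIPv4("127.0.0.1"));    // true
console.log(isPrivateIPv4("192.168.1.10")); // true
console.log(isPrivateIPv4("8.8.8.8"));      // false
```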

Mode Selection

  • Use mode: "best-effort" when you want maximum metadata coverage for downstream processing (summaries, indexing, enrichment).
  • Use mode: "platform-valid" when you want only OG/Twitter values that pass strict social validation filters.
  • Use extractWithDiagnostics or extractFromHtmlWithDiagnostics when you need validator-style reporting (valid, invalid, missing, warnings).

Common Use Cases

Link Preview

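import { extract } from "ogie";

const url = "https://example.com"; // example URL; substitute your own
const result = await extract(url);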
if (result.success) {
  const preview = {
    title: result.data.og.title || result.data.basic.title,
    description: result.data.og.description || result.data.basic.description,
    image: result.data.og.images[0]?.url,
    siteName: result.data.og.siteName,
    favicon: result.data.basic.favicon,
  };
}

SEO Audit

import { extractWithDiagnostics } from "ogie";

const url = "https://example.com"; // page to audit; substitute your own
const result = await extractWithDiagnostics(url, {
  mode: "platform-valid",
});
if (result.success) {
  console.log(result.diagnostics.missingRequiredFields);
  console.log(result.diagnostics.invalidFields);
  console.log(result.diagnostics.warnings);
}

Batch Processing

import { extractBulk } from "ogie";

const urls = ["https://a.com", "https://b.com", "https://c.com"];
const result = await extractBulk(urls, {
  concurrency: 5,
  onProgress: (p) => console.log(`${p.succeeded}/${p.total} done`),
});
console.log(`Success rate: ${result.stats.succeeded}/${result.stats.total}`);
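
After a batch run it is often useful to know why items failed, not only how many. The sketch below tallies failures by error code. It is self-contained: `BulkItem` and `ItemResult` are minimal local types that mirror the shapes used in this document (`item.url`, `item.result.success`, `result.error.code`) and are assumptions for illustration, not ogie's exported types. The sample data is invented.

```typescript
// Minimal local types for illustration; not ogie's exported types.
type ItemResult =
  | { success: true }
  | { success: false; error: { code: string; message: string } };

interface BulkItem {
  url: string;
  result: ItemResult;
}

// `countFailuresByCode` is a hypothetical helper.
function countFailuresByCode(items: BulkItem[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of items) {
    const r = item.result;
    if (r.success) continue;
    counts[r.error.code] = (counts[r.error.code] ?? 0) + 1;
  }
  return counts;
}

// Invented sample data standing in for `result.results`.
const sample: BulkItem[] = [
  { url: "https://a.com", result: { success: true } },
  { url: "https://b.com", result: { success: false, error: { code: "TIMEOUT", message: "timed out" } } },
  { url: "https://c.com", result: { success: false, error: { code: "FETCH_ERROR", message: "HTTP 503" } } },
  { url: "https://d.com", result: { success: false, error: { code: "TIMEOUT", message: "timed out" } } },
];

const failureCounts = countFailuresByCode(sample);
console.log(failureCounts); // { TIMEOUT: 2, FETCH_ERROR: 1 }
```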

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
