Do I need to know how to code?

No. Packages 1, 2, 3, and 4 include CSV and Excel formats, so you can open them in Excel, Google Sheets, or any BI tool. Packages 5, 6, and 7 are designed for Python, R, and SQL users.

What format should I choose?

If you're an Excel user or just getting started, pick Package 1 or 4 (CSV + Excel). If you're a bettor building a model, pick Package 2 or 3. If you're a data scientist working in Python or R, pick Package 5 or 6 (Parquet). If you want everything in every format, including a pre-loaded SQLite database, pick Package 7 (Complete Diamond).
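For the pre-loaded SQLite database, a first step is usually to see which tables it contains. The sketch below uses only Python's standard library. The function name list_tables and the file name in the usage note are illustrative assumptions, not names taken from the package.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted alphabetically."""
    # Open read-only so a mistyped path raises an error instead of creating an empty file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling list_tables("complete_diamond.sqlite") (a hypothetical file name) would return the table names, which you can then query with ordinary SQL.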

How do I open a parquet file?

In Python: pd.read_parquet("file.parquet"), with pandas imported as pd. In R: arrow::read_parquet("file.parquet"). In SQL: DuckDB reads Parquet files directly. In Excel: open the matching CSV copy instead (every Parquet file in packages 3, 4, and 7 has one).
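A minimal Python sketch of the route above, assuming pandas is installed along with a Parquet engine such as pyarrow. The helper names read_table and matching_csv are hypothetical, and so is the assumption that the CSV copy has the same file name with a .csv extension. The fallback only helps in packages that ship a CSV copy.

```python
from pathlib import Path


def matching_csv(parquet_path):
    """Return the assumed path of the CSV copy: same folder and name, .csv extension."""
    return Path(parquet_path).with_suffix(".csv")


def read_table(parquet_path):
    """Load a Parquet file with pandas, falling back to the CSV copy if no Parquet engine is installed."""
    import pandas as pd  # third-party: pip install pandas pyarrow

    try:
        return pd.read_parquet(parquet_path)
    except ImportError:
        # pandas raises ImportError when neither pyarrow nor fastparquet is available.
        return pd.read_csv(matching_csv(parquet_path))
```

read_table("file.parquet") returns a pandas DataFrame in either case.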

How big are the downloads?

Package 1 ≈ 28 MB · Package 2 ≈ 27 MB · Package 3 ≈ 47 MB · Package 4 ≈ 93 MB · Package 5 ≈ 2.7 GB · Package 6 ≈ 3.1 GB · Package 7 ≈ 6.5 GB. Packages 5, 6, and 7 are much larger because they include all 12 seasons of Statcast pitch-by-pitch data.

How do I get updates?

Every package is updated once a year, in November. The latest version is always available at current pricing. A purchase covers the version available at the time you buy; future annual releases are separate purchases.

What changes in each annual update?

The new season is added, mid-season corrections from upstream sources are pulled in, derived stats (career_war_cumulative, head-to-head matchups, velocity fatigue curves, etc.) are recomputed, and any data dictionary updates are bundled in. The schema stays stable — your existing scripts keep working when you upgrade to the latest version.
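If your scripts depend on specific columns, you can confirm they are still present before switching to a new release. The sketch below reads only the header row of a CSV file. The function name missing_columns and the file and column names in the usage note (other than career_war_cumulative) are illustrative assumptions.

```python
import csv


def missing_columns(csv_path, expected_columns):
    """Return the expected column names that are absent from the CSV header row."""
    # utf-8-sig also strips a byte-order mark if the file has one.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = set(header)
    return [name for name in expected_columns if name not in present]
```

For example, missing_columns("players.csv", ["player_id", "career_war_cumulative"]) returns an empty list when both columns exist in the header.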

Can I re-download my purchase?

Your download link is in your receipt email, so keep it safe. If you have issues, contact Payhip support directly at payhip.com. We do not have access to your payment or account information.

Can I share my download link?

Please don't. Each purchase is a non-transferable license for one buyer (an individual, or a single team or company), and download links are personal to that purchase. Sharing your link violates our terms.

What if I lose my receipt email?

Contact Payhip support directly at payhip.com — they can help you recover access. We do not have access to your payment or account information and cannot manually resend links.

Is this legal to use?

Yes. All data is compiled from publicly available sources. What you're paying for is the work product: cleaning, deduplicating, joining, enriching, and packaging in usable formats. See the disclaimer at the bottom of every page.

Can I use this commercially?

Yes, for research, modeling, betting, fantasy, internal analytics, and editorial use. Resale of the raw data itself is not allowed. You can publish derived insights, charts, models, or articles based on the data, but you cannot bundle our files and resell them as a competing dataset product.

Usage terms (the short version)

One purchase = a non-transferable license for you (or a single team / company) to use the data internally and to publish derived work. No redistribution of the raw files. No reselling. No exclusivity claims. Attribution is appreciated but not required for derivative work.

Disclaimer

"Data compiled and enriched from publicly available sources for research and analytical purposes. Not affiliated with or endorsed by any sport data provider or league. All derived features, enrichments, and computed statistics are original work product."

What is your refund policy?

All sales are final. These are digital data files: once they have been accessed or downloaded, we cannot un-deliver the data, so we do not offer refunds under any circumstances. We strongly recommend downloading the free sample before purchasing to confirm the data meets your needs. If you experience a technical issue with your download, contact [email protected] and we will make it right.

What if my file is corrupted or broken?

If a file is genuinely corrupted or fails to open, contact us at [email protected] with details. We will verify the issue and provide a working file. This is the only exception to our no-refunds policy — and it applies to technical delivery failures only, not change-of-mind purchases.
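Before reporting a problem, it can help to check whether the download itself is intact. The sketch below assumes your download arrives as a .zip archive, like the free sample, and uses Python's standard zipfile module to run its built-in CRC check. The function name check_zip is hypothetical.

```python
import zipfile


def check_zip(zip_path):
    """Return None if the archive passes its CRC checks, else a short description of the problem."""
    try:
        with zipfile.ZipFile(zip_path) as archive:
            bad_member = archive.testzip()  # name of the first corrupt file, or None
    except zipfile.BadZipFile as exc:
        # Typically a truncated or incomplete download.
        return f"not a readable zip archive: {exc}"
    if bad_member is None:
        return None
    return f"corrupted member: {bad_member}"
```

If check_zip reports a problem, try downloading again first. Including the returned message in your email gives us the details we need to verify the issue.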

What if I need a custom dataset?

Contact us — we can discuss custom extracts, alternative formats (DuckDB, BigQuery, Snowflake), or adding fields to one of the existing bundles. Pricing is quoted per request.

Do you do team / institutional licenses?

Yes. A single purchase covers an internal team or research group. For full multi-seat redistribution rights, contact us for an enterprise license.

Will you add more sports?

Yes — more sports are planned. The exact timeline depends on data availability. Keep your receipt email to stay informed, or check rawsportsvault.com for announcements.

How do I contact support?

Email [email protected] for data questions, custom-dataset requests, and broken-file reports. For download-link, receipt, or payment issues, contact Payhip support directly at payhip.com, because we do not have access to your payment or account information. We answer within one business day.

Try the data first

No email gate. Just the .zip.

Download free sample