Storage estimates
Modified: February 14, 2026 7:58 PM · Created: September 17, 2025 3:27 PM
Yes, that helps a lot. Using your file sizes, here’s what the storage footprint looks like per item and at collection scale. I split it between what typically lives in the DAMS (TIFFs + JPGs) and what usually sits in your backup/originals storage (RAWs + PSDs), since you keep those outside the DAMS.
Per-item storage (using your sizes)
Per capture/view inputs
- RAW (Capture One original): ~150 MB each × ~3 = 450 MB per view
- PSD (processed master): 300–500 MB per view
- TIFF (delivery/preservation): 130–150 MB per output
- JPG (access): 5–7 MB per output
Scenarios
| Item type | Views captured | Positives generated | DAMS (TIFF+JPG) per item | Backups (RAW+PSD) per item | Total per item |
|---|---|---|---|---|---|
| Print, front only | 1 | 0 | 135–157 MB | 750–950 MB | 0.86–1.08 GB |
| Print, front + back | 2 | 0 | 270–314 MB | 1.50–1.90 GB | 1.73–2.16 GB |
| Negative, front+back with positives | 2 | 2 (front/back positives) | 540–628 MB | 1.50–1.90 GB | 2.0–2.47 GB |
Notes:
- For negatives, you capture the negative (front/back). The positive files are derivatives, so they add TIFF/JPG but no extra RAWs, and typically no extra PSDs beyond the original captures.
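As a sanity check, the per-item arithmetic above can be sketched in a few lines of Python. The file sizes are the ones stated above; the function name and tuple layout are illustrative:

```python
# Per-item storage footprint from the stated file sizes (all values in MB).
# Assumptions: 3 RAW frames (~150 MB each) and one PSD (300-500 MB) per view;
# one TIFF (130-150 MB) and one JPG (5-7 MB) per output.

RAW_PER_VIEW = 3 * 150          # MB of RAW per captured view
PSD_PER_VIEW = (300, 500)       # MB, min-max
TIFF = (130, 150)               # MB per output
JPG = (5, 7)                    # MB per output

def per_item_mb(views, positives):
    """Return ((DAMS min, max), (backups min, max)) in MB for one item."""
    outputs = views + positives            # positives add TIFF/JPG only, no RAW/PSD
    dams = (outputs * (TIFF[0] + JPG[0]), outputs * (TIFF[1] + JPG[1]))
    backups = (views * (RAW_PER_VIEW + PSD_PER_VIEW[0]),
               views * (RAW_PER_VIEW + PSD_PER_VIEW[1]))
    return dams, backups

# Reproduce the three table rows:
print(per_item_mb(1, 0))  # print, front only
print(per_item_mb(2, 0))  # print, front + back
print(per_item_mb(2, 2))  # negative, front/back + two positives
```

The GB totals in the table follow by summing the two ranges and dividing by 1024.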
What that means at collection scale
I ran two mix scenarios (you can swap in your own percentages if you prefer):
- Conservative mix: 60% negatives, 30% single-sided prints, 10% double-sided prints
- Aggressive mix: 70% negatives, 20% single-sided prints, 10% double-sided prints
Totals below are ranges (min–max), in terabytes.
75,000 items (conservative mix)
- DAMS (TIFF+JPG): ~28.0–32.6 TB
- Backups (RAW+PSD): ~91.2–115.5 TB
- Combined: ~119–148 TB
100,000 items
- DAMS (TIFF+JPG): ~37.3–43.4 TB (conservative) / 41.2–47.9 TB (aggressive)
- Backups (RAW+PSD): ~121.6–154.0 TB (conservative) / 128.7–163.1 TB (aggressive)
- Combined: ~159–197 TB (conservative) / 170–211 TB (aggressive)
200,000 items
- DAMS (TIFF+JPG): ~74.7–86.8 TB (conservative) / 82.4–95.8 TB (aggressive)
- Backups (RAW+PSD): ~243.2–308.0 TB (conservative) / 257.5–326.2 TB (aggressive)
- Combined: ~318–395 TB (conservative) / 340–422 TB (aggressive)
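The collection-scale figures above follow from weighting the per-item ranges by the mix and converting MB to TB (binary, 1 TB = 1024² MB, which is what reproduces these numbers). A sketch, with the mix percentages as stated; the dictionary names are illustrative:

```python
# Collection-scale totals: weight the per-item MB ranges from the table by the
# mix, then convert MB -> TB (binary: 1 TB = 1024**2 MB).

PER_ITEM = {                       # ((DAMS min, max), (backups min, max)) in MB
    "negative":     ((540, 628), (1500, 1900)),
    "print_front":  ((135, 157), (750, 950)),
    "print_double": ((270, 314), (1500, 1900)),
}
MIXES = {
    "conservative": {"negative": 0.60, "print_front": 0.30, "print_double": 0.10},
    "aggressive":   {"negative": 0.70, "print_front": 0.20, "print_double": 0.10},
}

def totals_tb(items, mix):
    """Return ((DAMS min, max), (backups min, max)) in TB, rounded to 0.1."""
    dams, back = [0.0, 0.0], [0.0, 0.0]
    for kind, share in MIXES[mix].items():
        d, b = PER_ITEM[kind]
        for i in (0, 1):
            dams[i] += share * d[i]
            back[i] += share * b[i]
    to_tb = lambda mb: round(items * mb / 1024**2, 1)
    return tuple(to_tb(v) for v in dams), tuple(to_tb(v) for v in back)

print(totals_tb(100_000, "conservative"))  # matches the 100k figures above
```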
How this reads against your 200 TB Google Workspace pool
- The DAMS portion for 100k items is ~37–48 TB.
- The RAW+PSD backups for 100k items are the bigger piece at ~122–163 TB.
- Combined at 100k items: roughly 160–210 TB depending on the mix and the exact PSD/JPG sizes.
Translation:
- A 200 TB pooled Google Workspace plan can accommodate ~100k items if the mix skews closer to the conservative side and you’re careful.
- If your mix is closer to aggressive, or if other museum storage keeps growing, you’ll be near or over the 200 TB ceiling around 100k–120k items.
- At 200k items you will exceed 200 TB by a wide margin, regardless.
What to adjust (if you want more headroom)
- Keep JPGs out of cloud backups if the DAMS can generate them on demand. That only saves ~20–30 MB per negative item, so it’s small compared to RAW/PSD, but every bit helps.
- Revisit PSD retention after QA sign-off (controversial). If you trust the TIFFs as preservation masters and can reconstruct a workflow, retiring PSDs yields big savings (300–500 MB per view).
- Move RAW/PSD backups to cheaper object storage (Wasabi/Backblaze B2 with immutability) while keeping TIFFs in the DAMS. Your DAMS storage remains ~40–90 TB over time; your cheaper bucket carries the heavy RAW/PSD volume.
- Dedicate Google Workspace to “living” collections work and collaboration, not long-term cold backup. Keep it responsive; push the bulk to object storage that scales cleanly.
Quick recommendations based on your setup
- DAMS (TIFF + derived JPG previews): Pics.io + Wasabi or DBGallery Cloud.
- Heavy backups (RAW + PSD): Wasabi or Backblaze B2 with Object Lock (immutability) + lifecycle rules.
- Workspace: Keep Google Drive for working documents, Airtable exports, and your active PSDs in progress, but plan to offload completed RAW/PSD sets to object storage once QC is finished.
If you want, I can plug your exact item counts and mix into a small calculator and hand you a one-pager that says: “At X items you’ll hit 200 TB on Google; here’s how much sits in DAMS vs backups; here’s what to move where.”
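That calculator is small enough to sketch here. A minimal version using midpoint per-item sizes derived from the table above; the combined-MB figures and the `breakeven_items` helper are illustrative assumptions, not the exact model:

```python
# "At how many items do we hit the 200 TB pool?" -- combined DAMS + backups,
# using midpoints of the per-item ranges above (MB per item).

CAP_TB = 200
PER_ITEM_COMBINED_MB = {
    "negative":     584 + 1700,   # front/back capture + front/back positives
    "print_front":  146 + 850,    # single-sided print
    "print_double": 292 + 1700,   # double-sided print
}

def breakeven_items(mix):
    """Item count at which combined storage reaches the 200 TB cap."""
    per_item_mb = sum(share * PER_ITEM_COMBINED_MB[k] for k, share in mix.items())
    return int(CAP_TB * 1024**2 / per_item_mb)

conservative = {"negative": 0.60, "print_front": 0.30, "print_double": 0.10}
print(breakeven_items(conservative))   # lands in the 100k-120k range noted above
```

Swapping in your own mix percentages gives the break-even count for any scenario.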
Here are three storage projections 📊 based on your file sizes:
- DAMS only (TIFF + JPG) → Even at 200,000 items, you’re well under the 200 TB cap.
- Backups only (RAW + PSD) → This grows much faster, getting close to the 200 TB cap at around 150,000–200,000 items.
- Combined total (DAMS + Backups) → At the higher estimates, you’ll exceed 200 TB before hitting 200,000 items.
👉 This means Google Workspace’s pooled 200 TB could work for DAMS-only storage, but not if you also expect to keep all RAWs + PSDs there long-term. Would you like me to model what happens if you offload backups to Wasabi/Glacier and keep only DAMS derivatives (TIFF/JPG) in Google?
So I should consider storage backup separate from DAMS
I built the models and tables you asked for and displayed them as a downloadable table. Summary and interpretation below.
What I modelled
- Your curator’s rule: no digital positives for the backs of negatives. Negatives still get both front + back scanned as TIFFs, but we only create a positive TIFF (and JPG) for the front.
- Prints are treated as front-only in the DAMS (backs rarely scanned).
- File-size assumptions you gave (midpoints used where ranges existed):
- RAW: 3 × 150 MB per view = 450 MB per view
- PSD: 400 MB per view
- TIFF: 140 MB
- JPG: 6 MB
- For negatives: front and back scanned -> backups include RAW & PSD for both views; DAMS includes front negative TIFF, back negative TIFF, front positive TIFF, and a front JPG.
- For prints and ephemera: single view (front only).
- Mix scenarios modeled: Conservative (60% negatives / 35% prints / 5% ephemera), AggressiveNegatives (70/25/5), and PrintHeavy (50/45/5), each at 75k, 100k, and 200k items (the table carries full 75k/100k/200k rows for two of the mixes; AggressiveNegatives appears as sample rows).
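Under these assumptions, the per-item footprint works out as follows (a sketch; `per_item` is an illustrative name):

```python
# Per-item footprint under the curator's rule, midpoint file sizes (MB):
# 3 RAWs at 150 MB = 450 per view, PSD 400 per view, TIFF 140, JPG 6.

RAW_VIEW, PSD_VIEW, TIFF, JPG = 450, 400, 140, 6

def per_item(kind):
    """Return (DAMS MB, backups MB) for one item."""
    if kind == "negative":
        views = 2                 # front + back captured
        dams = 3 * TIFF + JPG     # front/back negative TIFFs + front positive TIFF + front JPG
    else:                         # prints and ephemera: front only
        views = 1
        dams = TIFF + JPG
    backups = views * (RAW_VIEW + PSD_VIEW)
    return dams, backups

print(per_item("negative"))   # (426, 1700)
print(per_item("print"))      # (146, 850)
```

Weighting these by the Conservative mix and scaling to 100k items reproduces the ~30 TB DAMS / ~130 TB backups figures in the takeaways below.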
Table (key columns shown)
You can download the full table from the Python output if you want. Columns include:
- Mix, Items, Counts (Negatives, Prints, Ephemera)
- DAMS_TB: TB needed for TIFF + JPG derivatives stored in DAMS
- Backups_TB: TB needed for RAW + PSD backups (working masters)
- Combined_TB: total TB
- Cost_All_Hot_$yr: Annual cost if everything (DAMS + backups) stored hot on Wasabi at $72/TB/yr
- Cost_All_Cold_$yr: Annual cost if DAMS on Wasabi and backups moved to Glacier at $12/TB/yr
- Cost_Hybrid20%Hot_$yr: Hybrid policy where 20% of backups remain hot (Wasabi) and remainder cold (Glacier)
Quick takeaways (from the table)
- For 100,000 items, Conservative mix:
- DAMS ~30 TB; Backups ~130 TB; Combined ~160 TB.
- Annual cost if All-Hot (Wasabi): ~$11,494/yr.
- All-Cold (DAMS hot on Wasabi + backups Glacier): ~$3,712/yr.
- Hybrid (20% hot backups): ~$5,269/yr.
- For 200,000 items, Conservative mix: Combined ~319 TB — you’ll exceed a 200 TB Google pool and should offload backups to cold storage or alternate provider.
- The hybrid approach typically gives the best balance: low cost with reasonable access for recent captures.
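The cost columns can be checked with a small model. The $72/TB/yr (Wasabi hot) and $12/TB/yr (Glacier cold) rates are the indicative figures assumed above, not quotes, and the 29.95/129.7 TB inputs are the unrounded values behind the ~30/~130 TB row:

```python
# Annual cost model: DAMS is always hot; a share of backups stays hot,
# the rest moves to cold storage.

HOT, COLD = 72, 12   # $/TB/yr (Wasabi hot, Glacier cold -- indicative rates)

def annual_cost(dams_tb, backups_tb, hot_backup_share):
    """Annual $ cost for a given hot/cold split of the backup tier."""
    hot_tb = dams_tb + hot_backup_share * backups_tb
    cold_tb = (1 - hot_backup_share) * backups_tb
    return hot_tb * HOT + cold_tb * COLD

dams, backups = 29.95, 129.7   # 100k items, Conservative mix (unrounded)
print(round(annual_cost(dams, backups, 1.0)))   # all hot, ~$11,494/yr
print(round(annual_cost(dams, backups, 0.0)))   # backups cold, ~$3,712/yr
print(round(annual_cost(dams, backups, 0.2)))   # hybrid 20% hot, ~$5,269/yr
```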
Next steps I can do right away
Pick one and I’ll produce it:
- Export the full table to CSV or Excel for your records and to share with your boss.
- Regenerate the table with your exact PSD size (300 vs 500) or different RAW counts (if sometimes fewer than 3).
- Produce charts (TB vs items, and annual cost vs items) for one or more mixes so you can include visuals in a presentation.
- Draft the lifecycle policy text to match the hybrid option you prefer (e.g., 12-month hot window, then cold archive with manifest and SLA).
Which of those would you like next?