the last word

Photography meets digital computer technology. Photography wins -- most of the time.

Data scaling

July 19, 2013 JimK

This is the kind of tech-heavy post that I’d normally put in The Bleeding Edge. I’m posting it here because I think the topic is so important to digital photographers. It starts out with a history lesson, but you can skip that if you’re impatient. Over the next few posts I’ll be dealing with the ramifications of disk performance improvements failing to keep up with massive capacity gains.

I remember my first 300 MB disk. I bought it in the late 80s or very early 90s, when I was first getting into digital photography. It was a 5 ¼ inch full height unit, and weighed a ton. It came in a 2 foot by 2 foot by 2 foot box that was full of gray springy foam, with the disk itself floating in the middle. It cost more than three thousand bucks. It used an eight-bit wide SCSI interface, and I had to buy a SCSI adapter from Adaptec to run it. Still, it was a big improvement over the 40 MB disks which were the standard of the day, which were themselves a big improvement over the 10 MB disks that first shipped on the IBM PC XT.

Disks have gotten much larger since, with the current high water mark for mainstream products being 4 TB – more than ten thousand times the capacity of my $3K SCSI disk for under a tenth the price. The disks have gotten faster, too, but the speed hasn’t kept up with the size.

The original PC XT 10 MB hard disk, the Seagate ST-412, could perform a seek in an average time of 85 msec. By the end of the 1980s, Seagate had successor products that got that time under 20 msec. The average seek time of today’s 7200 rpm drives is a little under 10 msec. Since the total access time also depends on the rotational speed (after the seek, you have to wait half a rotation on average for the data to come under the head), there isn’t a lot of opportunity to drive it down much further, although caching can ameliorate the effects for some access patterns. Indeed, the situation is getting worse. 15,000 rpm disks have been around for years, and 10,000 rpm used to be something of a server standard, but the bulk of today’s sales is at 7200 rpm or below. In the past few years there has even been a trend towards 5000ish rpm disks for “near-line” (spinning backup) use. By the way, the reason that spindle speeds stopped going up is the same reason CPU clock rates stopped increasing: heat.
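The half-rotation wait is easy to put numbers on. The following is a minimal sketch of that arithmetic only; it ignores head movement and caching, the function name is made up for the example, and 5400 rpm is an assumed stand-in for "5000ish".

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to come under the head: half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0

# 5400 rpm is an assumed example of a "5000ish" rpm near-line disk.
for rpm in (5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Even at 15,000 rpm the wait is still 2 msec per access, so faster spindles alone offer only a modest gain.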

“OK”, I hear you saying, “That’s true about seeks, but that’s not the key spec for photographers: what about transfer rate?” There’s been some improvement there, of course, but the peak transfer rate goes up as the square root of the areal density, whereas the disk capacity rises in proportion to the areal density. As you get smaller bits, you can put more data in one track, and that helps the transfer rate. You can also have more tracks, and that doesn’t do a thing for the transfer rate.
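Here is a toy model of that scaling argument. It assumes a density gain of k is split evenly between bits per track and number of tracks, at a fixed spindle speed, so capacity grows by k while transfer rate grows by only the square root of k. The function name and the sample values of k are illustrative, not measurements.

```python
import math

def scale_with_density(k: float) -> dict:
    """Toy model: areal density grows by a factor k, split evenly between
    bits per track and number of tracks (an assumption)."""
    capacity_gain = k               # capacity tracks areal density
    transfer_gain = math.sqrt(k)    # only the along-track gain helps the rate
    return {
        "capacity_gain": capacity_gain,
        "transfer_gain": transfer_gain,
        # time to read the whole disk scales as capacity / rate
        "full_read_time_gain": capacity_gain / transfer_gain,
    }

for k in (4, 100, 10_000):
    r = scale_with_density(k)
    print(f"density x{k}: capacity x{r['capacity_gain']:g}, "
          f"transfer x{r['transfer_gain']:g}, "
          f"full read x{r['full_read_time_gain']:g}")
```

Under this model, the time to read an entire disk itself grows as the square root of the density gain.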

The original PC XT disk had a maximum sustained transfer rate of a little under one megabyte per second. The new Seagate desktop 4 TB disks have a sustained transfer rate of 180 MB/s. That’s about a two-hundredfold increase in transfer rate for a 400,000-fold increase in capacity (10 MB to 4 TB), which leaves the transfer rate a factor of a thousand or more short of what would be necessary to move data at a pace that kept up with the amount of data.
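One way to feel the gap is to ask how long it takes to read each disk from end to end at its sustained rate. This is a rough sketch using the figures above, with the XT disk's rate rounded up to 1 MB/s and decimal units assumed; real drives slow down toward the inner tracks, so it is a best case.

```python
MB = 10**6
TB = 10**12

def full_read_seconds(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Best-case time to read the whole disk once at its sustained rate."""
    return capacity_bytes / rate_bytes_per_s

old = full_read_seconds(10 * MB, 1 * MB)    # PC XT disk, rate rounded up to 1 MB/s
new = full_read_seconds(4 * TB, 180 * MB)   # 4 TB desktop disk at 180 MB/s

print(f"10 MB disk: {old:.0f} s")
print(f"4 TB disk:  {new / 3600:.1f} hours")
```

That works out to roughly ten seconds for the old disk versus a little over six hours for the new one, before any file system, RAID, or network overhead.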

The result? Things take a lot longer than they used to. There have been some changes in the way we use disks that haven’t helped:

  • RAID 5 decreases write speed, even as it has the potential to increase read speed
  • Network attached disk arrays offer slower transfer speeds than the native drive interfaces
  • Likewise USB, although USB 3 goes a long way to fixing this

And there have been things we have done that do help:

  • Striped arrays (RAID 0 and 10) are faster proportionally to the number of non-redundant disks involved
  • High speed serial interfaces, like Thunderbolt and the newer versions of USB

All of these changes, good and bad, don’t make much difference when faced with a thousandfold challenge.
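A back-of-the-envelope sketch of why striping alone doesn't close the gap. It assumes ideal linear scaling (n striped disks deliver n times the single-disk rate, which real arrays rarely reach) and reuses the 180 MB/s and 4 TB figures from above. The function names are illustrative.

```python
MB = 10**6
TB = 10**12

def striped_rate(n_disks: int, single_rate: float = 180 * MB) -> float:
    """Ideal aggregate transfer rate of n striped, non-redundant disks."""
    return n_disks * single_rate

def hours_to_fill(n_disks: int, disk_capacity: float = 4 * TB) -> float:
    """Hours to write the whole array once at the ideal striped rate."""
    return n_disks * disk_capacity / striped_rate(n_disks) / 3600

for n in (1, 2, 4, 8):
    print(f"{n} disks: {striped_rate(n) / MB:.0f} MB/s, "
          f"{hours_to_fill(n):.1f} h to fill")
```

Striping raises capacity along with throughput, so in this idealized model the time to fill the array stays the same no matter how many disks are added.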

I had this all driven home to me recently. I had a five-bay Synology box stuffed full of 2 TB Western Digital disks and configured as RAID 5, for a total capacity of 8 TB. I was using the box for backup, and it was full. I bought a set of 4 TB Seagate drives, ripped out the WD ones, and initialized the software, picking the same RAID 5 configuration for a total of 16 TB. I named the box Daisy (sticking with the Downton Abbey theme), joined it to the domain, and started copying over one of the directories from Anna, a file server.

[Image: long time to fill up new syn box]

Vice Versa said the operation would take a couple of days. Turns out it was optimistic. Here’s what it looks like now:

[Image: vv two thirds through]

And that Synology array I created? It did data scrubbing for almost three days.

[Image: syn 2 days in]

Want another data point? I attached a 4 TB Seagate drive to a computer via USB 3 for offsite backup. The quick format failed twice in a row, so I told Windows to do a full format. Here’s what it looked like after 19 hours:

[Image: 19 hrs into a full format 4tb]

It looks like it’s going to take several days.

The trends are clear; unless we do something differently, things are going to continue to get worse. What can be done? Stay tuned.

Unless otherwise noted, all images copyright Jim Kasson.