Programming time, dates, timezones, recurring events, leap seconds… Everything is pretty terrible.
The common refrain in the industry is “Just use UTC!” And that’s correct… Sort of. But if you’re stuck building software that deals with time, there’s so much more to consider.
It’s time… To talk about time.
This is one of the best articles – experiences? – I’ve ever read. It’s funny, well-written, deeply informative, and covers everything from programming with time, to time and UI design, to time and accessibility. This is simply an amazing piece of work.
There need to be official, free online services for these things. One should be able to store all dates and times as UTC, and be able to download the most up-to-date rules needed to convert between UTC and local time zones and display options.
But there are APIs for this, for both the OS and browser. There is no problem there I think.
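For instance, in a browser it’s all standard Intl machinery (a rough sketch; the zone name in the comment is just an example):

```typescript
// The runtime ships the tz database rules; you never download them yourself.
const instant = new Date("2018-05-30T08:30:00Z"); // stored and sent as UTC

// Ask the browser for the user's IANA zone, then render the instant in it.
const zone = Intl.DateTimeFormat().resolvedOptions().timeZone; // e.g. "Europe/Oslo"
const formatter = new Intl.DateTimeFormat(undefined, {
  dateStyle: "full",
  timeStyle: "long",
  timeZone: zone,
});
console.log(formatter.format(instant)); // localized, zone-correct display
```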
Always store time as UTC. Sometimes – but in reality pretty much never – you might also want to know what timezone the originator used, so you need to store that too. You almost always want to track time rather than dates, because the majority of events are associated with a specific time. Only having to track dates is the exception.
How about no?
ISO 8601 etc. is about how you present the time and date to the user. It’s not a suitable wire format. Since we’re obviously talking frontend in this article, it’s better to use a UTC timestamp (i.e. an integer or decimal) on the wire and display it in the timezone and localized format of the browser/user.
You should NOT use the formatted time string for sorting. WTF article?
And no, you shouldn’t use the timestamp column type, especially not with MySQL, since it does timezone conversions that might surprise you. Use an integer or, if you need more precision, a decimal.
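For what it’s worth, a minimal sketch of the scheme I mean (the field name is mine, purely for illustration): an integer on the wire, localization only at the edge.

```typescript
// Wire format: milliseconds since the Unix epoch, a plain integer in JSON.
type WirePayload = { createdAt: number };

const payload: WirePayload = JSON.parse('{"createdAt": 1527667200000}');

// Sort on the raw integers, never on formatted strings.
// Only at display time do the user's locale and timezone come into play.
console.log(new Date(payload.createdAt).toLocaleString());
```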
No, this article isn’t very good from a programming point of view. Very wordy though.
The question is how much you would really save in bandwidth (or what would the reason be?) by sending an integer rather than the ISO string. If you send it in JSON, it gets converted to a string anyway.
100% agreed. Summary: we measure time, there are time zones that change, and some standards exist to represent time with or without the time zone. Also there are time and date pickers. The intro was OK, if a bit shallow. Some people just have too much _time_ on their hands it seems. Never mind 85% form, 15% content, but UTC actually *is* a universal representation of time.

Time zones are only to do with locale and end-user device formatting / display / social representation / culture. They do nothing for scheduling or order of events… unless you somehow want to schedule them independently for each time zone (what?). It’s all in the browser, but this is what the article targets I suppose. Storing a timestamp with a time zone only adds the location aspect, which, granted, may be useful, but not for the timestamp itself, while yes, it enhances usability. Use a tz-based format “over the wire”? Nope. Or I have a different understanding of what “over the wire” means.

I could not find any useful answers to the question of how to go about designing calendars (or date / time pickers?) in this text either, only that it be hard and some people tweeted about it, yo. If he’s going for completeness in the intro part, there should be a mention of the NTP epoch also. Damn these Web kids… if this kind of stuff is presented at conference talks then I’m glad I don’t get to sit and listen through it. Pen, neck, stab.
I’d disagree with that – I suspect it depends a lot on the kind of system you’re dealing with. Most of the software I work on, there are very few timestamps – those that exist are mostly audit records indicating when something has changed.
Far more common are pure local dates, typically indicating that some piece of data is applicable starting or ending on some particular date, etc. Technically, yes, there’s a time implied in that – usually interpreted as either the beginning or end of day – but that interpretation is a business rule. It actually varies between clients – one might want 2018-05-30 to be interpreted as being in effect from 00:00 that day, while another might want it to be read as 23:59:59. It mostly comes down to when batch processing runs anyway…
Point is, a lot of software doesn’t need to cope with having users scattered across the timezones of the world. It’s often enough just to track a timeless date, and re-interpret it into a time context when needed.
This is how we ended up with Y2K.
But in all seriousness, being forward-thinking and using UTC instead of local time is not even a minor hassle.
Well actually, it sometimes is. In my case, the users simply don’t think in “time” at all… it wouldn’t make sense for them to declare that a new parameter takes effect from May 31 10:00 UTC. The start date is simply “June 1”, with the understanding that this will be interpreted as the start of the day in the local timezone. It would be crazy to complicate things by introducing times and timezones into the picture…
The point is, you learn your requirements, your users, and your problem domain – then build a solution that fits. There’s a reason why any good date API strongly distinguishes between different types – dates, datetimes, datetimes with timezone, etc – and provides methods for re-interpreting one into another. All of them have their uses…
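To make that distinction concrete, a tiny sketch (the type and the two rules are made up for illustration): the plain date carries no zone, and each client’s business rule decides how it becomes an instant.

```typescript
// A pure calendar date: no time of day, no zone. A business-level value.
type PlainDate = { year: number; month: number; day: number };

// One client's rule: "in effect from" means start of day (UTC for simplicity).
function effectiveFrom(d: PlainDate): Date {
  return new Date(Date.UTC(d.year, d.month - 1, d.day, 0, 0, 0));
}

// Another client's rule: in effect through the end of the day.
function effectiveTo(d: PlainDate): Date {
  return new Date(Date.UTC(d.year, d.month - 1, d.day, 23, 59, 59));
}

console.log(effectiveFrom({ year: 2018, month: 6, day: 1 }).toISOString());
```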
Uh, are you sure you’re thinking of the right standard? We’re talking about “ISO 8601 Data elements and interchange formats”… that is, the standard that’s entirely about wire formats, and has nothing to do with human-focused presentation at all?
Not to mention that the ISO 8601 wire-format representation of a UTC date can be pretty compact (“20180530T083000Z” at its most compact, “2018-05-30T08:30:00Z” if you add the optional delimiters for a more human-friendly form).
It’s also self-describing to the extent that you don’t need to specify the epoch as you do with the various “[units] since [epoch]” formats that have been used over the years (POSIX timestamp isn’t the only one) or bake old bugs in the conversion function into the spec, as you do with the MS Office timestamps supported by OOXML.
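(And pretty much every runtime already speaks it out of the box; in a browser, for instance:)

```typescript
// Emit: an ISO 8601 UTC instant. Self-describing, no epoch to agree on.
const wire = new Date().toISOString(); // e.g. "2018-05-30T08:30:00.000Z"

// Parse: Date.parse understands this ISO 8601 profile directly.
const instant = new Date(Date.parse(wire));
console.log(instant.getTime());
```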
If you’re arguing against that, you should first successfully argue against UTF-8 (bloated for CJK languages unless you apply compression!), HTML (create a binary format for better compaction!), and XML-based formats like ODF, OOXML, RSS, and Atom.
(Spoiler: We moved away from binary formats because a dedicated compression algorithm gives better space utilization and a text-based interchange format is easier to implement and more amenable to manual recovery if there’s a bug in the serialization stage prior to feeding it into a mature, proven compression library.)
ssokolow,
The shortcomings of binary mostly have to do with tooling rather than anything intrinsically better about text. Do you know the saying “to someone with a hammer, everything looks like a nail”? Well, this is an apt description of unix: it was largely the textual tools, parsing, text pipes, etc., that consequently led to text-centric formats and protocols. To someone with only text processing tools, text formats may be more accessible. But that in and of itself doesn’t mean other types of tools couldn’t exist or be better. The incentive to use human-readable text in formats and protocols is driven more by the tools we have than by technical constraints.
Standardized image formats are one example. We could use text instead of binary, but the image tools are so much better that we’ve completely done away with human-readable text for image formats. It’s just a complete non-goal.
For a more serious example, look at how databases have not only revolutionized our software, but have also made accessing structured data even easier than a text editor. These databases use binary formats because there’s no benefit in having humans muck around the data files since the binary database tools are easier to use than text.
It’s only thanks to tools and standards that bytes are given meaning. It is all relative to what we are using. If we had taken an alternate evolutionary path and embraced what we consider “binary” from our vantage point, we wouldn’t necessarily consider it “binary” from that path since the tools they would use might well render those bytes as human readable text. Our bytes might well be binary to them. Consider that under IBM EBCDIC, ANSI “text” files look like binary and need to be converted prior to use. The point I’m getting at is that there’s nothing inherently superior about our text tools and standards. Standard tools and formats could have evolved with more structure and it wouldn’t necessarily have been a bad thing.
Perhaps, but given the tooling we do have, you might as well be arguing in favour of EBCDIC. Also, ANSI != ASCII.
That aside, I much prefer being able to salvage various kinds of corrupted output without having to grab a hex editor.
ssokolow,
We can include as much or as little redundancy and corruption detection as desired, although I’ll admit that many operating systems suffer from the inability to automatically detect and recover from corruption. I think operating systems could learn a thing or two from P2P systems by using end-to-end file hashing, but that’s a topic for a different discussion.
Or, you know, you could just use a real text editor that has line folding abilities like Vim or EMACS. I’ll agree that XML is a bit of a pain because of how verbose it is and the whole attributes versus tags thing, but JSON is not really all that bad other than the lack of support for comments provided that you’re not doing stupid things like having arrays where you will never have more than one item.
ahferroin7,
I do use vim personally, it’s just not as great a manipulation tool for structured data as I’d prefer. However putting my opinion aside, I’d like to use your suggestion as evidence that my underlying point is valid. In particular that having good tools makes the need to work directly with the underlying formats much less important.
Think about it, if VIM became a great tool for manipulating the structured data in YAML, XML, JSON, BSON, then we can begin to focus more on the data itself without caring much about data formats. I think there’s a bit of “text is best” group think among those of us in the unix/linux world, but there’s no reason the unix philosophy has to stop with text primitives. We’re in a kind of chicken and egg problem with many people set in their ways, but rather than remain text-centric, we could extend unix operating systems primitives to be more data-aware and unix could be better off for it.
I’m not quite sure how exactly you’re having significant issues manipulating structured data in vim. I mean, BSON is going to be a pain in any text editor unless you convert it to some other format, simply because it’s tedious to have to remember the format rules and enter individual bytes by hand. XML is nearly as bad because of how verbose it is, plus the whole attribute versus tag dichotomy. JSON, though, is just fine as long as it’s properly formatted to emphasize the nesting (which is trivial to do with either a shell script or a vim script), and I’ve never had any issues manipulating YAML from any text editor except for really stupid ones that make the indentation tedious.
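(The reformatting step really is trivial; here it is as a one-liner sketch with throwaway example data, rather than as a vim script:)

```typescript
// Round-trip compact JSON through the parser to expose the nesting.
const compact = '{"user":{"name":"jan","roles":["admin","dev"]}}';
console.log(JSON.stringify(JSON.parse(compact), null, 2));
```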
Now, all that aside, yes, having good tools can help, but depending on the tools, it can also be a significant issue. Having the tools hide all the format details from you can result in difficulty understanding why something isn’t working the way you want it to. In essence, the tool can become a crutch, and without it you end up being essentially useless.
Take for example the first CS class I took in college: the class was taught using the NetBeans IDE, which by default does auto-completion without you having to manually confirm whether you entered the shorthand correctly. When the first written test came around (which did not involve using the IDE at all), half the class failed it because they used auto-complete shorthand in their code instead of writing out the actual function names (despite the professor telling the class that any question they did so on would immediately be marked wrong, even if the code was otherwise correct).
ahferroin7,
Haha, Cute
Don’t get me wrong, I understand the concern with being able to understand the low level details. I’m a low level programmer myself, I love the details! But if we don’t let ourselves evolve to higher abstractions, we’re ultimately just stunting our potential. For me, text files (JSON,XML,CSV,etc) are just a means to an end. They’ve gotten us a long way, but we shouldn’t hold back higher abstractions just to keep them around.
For BSON, I agree, it’s an issue of the tooling. Same for YAML, but the YAML-related stuff is trivial to handle in vim (set softtabstop=2, set shiftwidth=2, set expandtab, boom, done). The stuff with XML is a problem with XML though: the format itself is unnecessarily complicated and verbose, which is a large part of why sane people aren’t using it for new code that doesn’t need backwards compatibility. As for JSON, the problem is just laziness on the part of the application that generated the data (there is literally no point in using a textual serialization format if you’re not going to make sure it is readable and editable by humans without a fancy text editor or special tool).
ahferroin7,
I don’t necessarily disagree with your characterization of any of these formats, but it still seems like you are thinking in terms of text editing rather than in higher level abstractions that make such details irrelevant. I know that I’m the oddball here and that the reluctance to embrace higher levels of abstraction is typical in unix where text is king, but is it that hard to understand that text editors/tools are not the best way to consume/create structured data?
Edit:
Unix did a great job making it easy to pipe data through all of its text processing tools. But these primitives haven’t been significantly updated in decades, and to me it seems we keep using them because we have them, even though they don’t quite fit our needs. If we could re-invent these tools, but this time with an eye towards higher level abstractions and data structures, then we would have more powerful tools that make it more convenient to handle modern data processing tasks with more complex data structures than the text files of the past.
There are plenty of tools out there for handling things like this. Take a look for example at http://stedolan.github.io/jq/, or all of the options you have for XSLT processors, or the growing number of tools which support output in formats other than simple textual tables.
Just because they exist though doesn’t mean people are going to use them. Using jq as a very specific example, provided the JSON is formatted properly (with new lines, though not necessarily indentation), I can do 95% of what I would need jq to do in vim just as fast with less typing.
ahferroin7,
I have yet to find a good tool that integrates all these things in a standard way. You yourself highlight the utility of VIM being able to parse various formats, and I acknowledge how incredibly useful that is. I’d like to see the same kind of integration happen with structured data abstractions rather than plain text editors.
An example may help…
For clients I frequently get data from numerous sources in numerous formats, containing products, specs, broker information, spreadsheets, etc. This is such a common thing and yet while we have tools and APIs for all of these, I wouldn’t say that any of them really work together and I end up having to do a lot of custom programming just to get data where it needs to go. This shouldn’t be so cumbersome, IMHO it should be trivial. I want to be able to open and work with all these formats in a unified way without a second thought. But our tools just aren’t there yet, and part of the reason is that we’ve focused too much on low level formats and not enough on high level abstractions.
Once we get tools that can understand high level abstractions and software standards that allow them to inter-operate (similar to unix pipes for text), that’s when we’ll get the major productivity boost that we’re not really getting with low level text formats today. I concede that this may never happen, but it would really help in bridging the world’s isolated islands of data.
I’m curious, do you agree at all that higher level abstractions could help or are you still adamant that plain text formats & editors are sufficient?
I’ve thought about building the tools that fulfill this vision myself, but honestly it would be up to people like you to use it or not. That’s the thing, if I built the platform, it wouldn’t make a difference if no one is interested in using it.
OK, I’ve been at least partly misunderstanding you here. I hadn’t realized that what you’re worried about is not so much the usability of the formats themselves, as the lack of a proper integrated format-agnostic editing environment.
On that point, I absolutely agree. I regularly use some rather complex XSLT stylesheets to convert XML files I have to deal with into a more reasonable format for editing (originally JSON, but these days YAML), and while I’ve been lucky enough to not have to deal with BSON almost at all, I would probably do similarly there, so yes, I do think such a tool would be useful, even if it’s just normalizing everything to one format.
As far as creating something to do what you’re talking about, I’d say go for it. You’re either looking at something equivalent to ODBC if you want to go for a library and framework, or pandoc if you just do an application that translates formats. Both are extremely useful pieces of software, even if they’re not very widely used (the first because of performance reasons, the second because people often don’t know about it or don’t want to deal with Haskell).
ahferroin7,
Yeah, I think I’d enjoy the project, but the last thing I need is another unpaid side project. Something like this really needs to be open source, yet none of my clients would pay for that. Things would be so much better if we didn’t have to worry about income, haha.
There’s a lot more to discuss about data abstraction, but I think this is a good spot to end this particular thread. In my fantasy world, we could all be talking about these things at a pub over beers.
Once, for a pet project of mine (putting an image in a KEO message), I stumbled on an image format which does use text – a character per pixel, determining greyscale brightness.
Yes, that is a good point. I was being a bit narrow-minded there. ISO 8601 is usable as a wire format. I would still argue that a timestamp is better though, if for no other reason than that more languages and libraries understand timestamps. This is actually quite sad. There’s also less parsing involved with a timestamp than with the ISO 8601 format.
No, I’d have to disagree with that – timestamps (i.e. units since an epoch) aren’t as universal as you think, since the first step is to agree on an epoch (year 0? 1900? 1970?), then to agree on what units you’re counting in (seconds? millis? nanos?). So in practice, you have quite a lot of standards to choose from.
By contrast, the ISO8601 format for date-times is understood by pretty much any language that has a concept of a date-time… and if you find one that doesn’t, writing your own is trivial.
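To illustrate just how much there is to agree on, take one raw number (the value is picked purely for illustration):

```typescript
const raw = 1527667200;

// Interpreted as seconds since the Unix epoch: 2018-05-30T08:00:00Z.
console.log(new Date(raw * 1000).toISOString());

// Interpreted as milliseconds since the same epoch: mid-January 1970.
console.log(new Date(raw).toISOString());

// A 1900-based epoch (as NTP uses) would shift the result by another 70 years.
```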
Timestamps suck: there is no standard for their resolution, and if you store them in a format where 1970 is your epoch, then good luck storing things like birthdates. When you exchange data you really want all parties to agree that it is a datetime, so they can work correctly with it.
Storing datetimes as numbers in databases is a really poor solution as well, unless you do extremely simple things with them; good luck trying to group by year, month, or day of week, or to select by interval.
In my experience, you need to send formatted strings to the client for presentation to be able to control them correctly, unless said client can do it correctly itself; but if the client is a browser, that is a definitive no if you do heavily localized webapps. (Same goes for pretty-printing of numbers, which is also surprisingly complicated with regard to the decimal point, digit grouping, and how to show negative numbers.)
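(To show the kind of variation I mean on the number side, a quick sketch with Intl.NumberFormat, which only gets you part of the way:)

```typescript
// Locale-aware formatting: decimal point, digit grouping, negative sign.
const n = -1234567.89;
console.log(new Intl.NumberFormat("de-DE").format(n)); // "-1.234.567,89"
console.log(new Intl.NumberFormat("en-US").format(n)); // "-1,234,567.89"
```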
Date input is a nightmare; I have yet to find a JavaScript date input that supports nearly as many cultures as .NET does. Take Thailand as an example: it is really “fun” trying to handle input/output related to picking dates in Gregorian calendar style to satisfy the datepicker, and then trying to actually show it correctly formatted in other places. (If anyone wonders, they use the Buddhist calendar, which is 543 years ahead of the Gregorian calendar.)
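At least the output side of that is something the browser’s Intl API can manage (a sketch; note the Thai locale defaults to the Buddhist calendar):

```typescript
const d = new Date("2018-05-30T00:00:00Z");

// Thai locale: Buddhist era, so 2018 CE is rendered as year 2561.
const thai = new Intl.DateTimeFormat("th-TH", {
  dateStyle: "long",
  timeZone: "Asia/Bangkok",
}).format(d);
console.log(thai); // "30 พฤษภาคม 2561"

// Input remains the hard part: most datepickers only speak Gregorian.
```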
I agree that, when possible, you generally should retain the time part; it makes things so much easier, and it is easier to ignore too much info than to find out you have too little. Dates without time are a whole new can of worms in a global application though, as you have no time to timezone-convert, so deciding for a random user whether today is May 30th is tricky.
And yes, it is usually the easiest solution to store dates as UTC and then convert them on input and display. It is not the only way though: most databases can handle saving local times and preserving the timezone information, so they can do all the converting when you query and output them.
Not sure how imperative it really is to treat the user to their local date format and not just use YYYY-MM-DD (as a lot of websites are actually doing).
I for one would prefer that, as it is confusing to be confronted in practice with different date formats from websites of different origins.
Well, YYYY-MM-DD is at least consistent with the way we specify time when no date is needed, HH:mm:ss (well, the seconds are usually dropped). I find that way of doing it logical: the least specific info on the left and finer-grained divisions on the right. Why should dates be read in a separate way from times of day?
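(Producing that form regardless of locale is a one-liner too, for what it’s worth:)

```typescript
// The UTC calendar date as YYYY-MM-DD, locale-independent.
// Caveat: near midnight this can differ from the user's local calendar date.
console.log(new Date().toISOString().slice(0, 10)); // e.g. "2018-05-30"
```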
If I open that website, the temperature of the CPU rises by more than 20 degrees.
I would like to read the text; I do not need fancy animations that heat up my CPU for absolutely nothing.
Regards,
Jan
China already has a single universal time for the entire country, and it’s huge. Time zones make no sense to me. So what if I go to work at 13:00 UTC and someone on the west coast goes to work at 15:00 UTC? That’s much better than both of us saying we go to work at 8:00 AM.
https://www.facebook.com/Abolish-Time-Zones-218240794869421/