Should books include URLs? (book design)

I’m currently working hard with a great team of volunteers on the paperback edition of The Myths of Innovation.

We’ve hit on a problem that could use some more opinions. You guys have been great in the past (see: Footnotes vs. Endnotes).

One goal we have is to make the book more useful. In the original edition I tried hard to support claims with references, and above all, provide ways for readers to continue on the paths the book opens. But the web, and its constantly changing landscape, makes this tricky.

Since 2007, at least 15% of the URLs in the book have broken: pure 404s. (Thanks to Piotr, Theo and Allison, my URL Overlords, this should definitely be fixed in the paperback.) Another bunch of pages have simply changed or been rewritten, making the reference less applicable.

In doing my research I leaned more heavily on books and journals, but in writing/footnoting the book I leaned towards web sources, since the odds of someone actually bothering to type in a URL seemed much higher than them digging up an obscure journal or tome. I’d rather have references people use than ones they are just impressed by. But this seems to have had disappointing side effects.

So here are the options I’ve heard so far:

  1. Do not include ANY URLs in books. Refer to books and journals only.
  2. Refer to URLs, but simultaneously post the list online so they can be updated at any time (and mention this clearly in the book). The book may get crusty, but readers are always offered an updated list of references.
  3. Use a bit.ly type link converter – If raw URLs are not used, the links in the book will never be crusty – they can always be redirected to something useful. This is cool, but the problem is the book will be filled with mystery references – I can’t necessarily explain in the book what is being linked to. Even in the 2007 edition, a footnote has value if you never look it up, just from seeing what the source is (What university or website is it from?). I could include the original reference, and if its URL dies, redirect to a new page that at least explains what the old reference was.
  4. Is there another option?

Have you seen other approaches in books you’ve read or purchased? Or can you think one up?

I’d love for the paperback edition to be as smart and useful as possible in handling this. Let me know what you think. Thanks.

41 Responses to “Should books include URLs? (book design)”

  1. Drew @ Cook Like Your Grandmother

    #2

    Errata is a well understood concept. I’d avoid the shorteners, because if the one you chose goes out of business all your links go dead.

    A book is a snapshot in time. The reference existed at the time you published.

  2. paurullan

    I would completely avoid the bit.ly stuff: not only does the book become tied to the service, but you can’t tell what source a link points to.

    I would take a three-fold approach: combine PURL+DOI where possible, a LaTeX-style bibliography at the end of the book and an online wiki.

    Seems future-proof to me.

    http://en.wikipedia.org/wiki/Digital_object_identifier

  3. Basil Vandegriend

    One recent scientific paper I read used http://webcitation.org/ which is really option 3, but it appears that webcitation stores a copy of the original page so you never have to worry about the content disappearing.

  4. Sean

    It seems that for e-books, it should be possible to dynamically update URLs as they need to change. People are (understandably) touchy about their books being modified after they’ve been purchased, but I think this is an acceptable time to do it. Really, I think people would be ok with automatic errata updates too, as long as there’s a way to know it happened and see what was there before. Give people a “revision history” and it would be fine.

    Of course, this requires technology changes that are outside of your control, so it’s not very helpful for the present problem…

    For whatever scheme you end up using, QR codes could be a nice way to let people get the links without having to type them in manually.

  5. Divya

    Could you not post the list of links on a page on your website, and cite each link like this: http://mythofinnovation.com/list/#cite1 which will scroll the window to the link and a description of that link? This way, the book can be current and if the cited link changes, all that is required for an update is editing the page at /list/

  6. jcopenha

    In Chris Sells’ “Windows Forms 2.0 Programming” book he included both. For instance, on page 258 he has the following footnote:

    http://si.wikipedia.org/wiki/Wikipedia:Sinhala_font (http://tinysells.com/18). Thanks to Miguel…

    I like this for two reasons. I’m very reluctant to type in most URLs; they aren’t meant for human consumption. However, a tiny URL that you can manage would be easy to type in, and I’m more likely to actually go READ the website referenced.

    He published this book before the rise of URL shorteners, which is why he did his own. But with a URL shortener you manage yourself, you can update the back-end URL if it ever goes stale.

  7. Kristen

    Recently I solved this problem from the user side (mostly), but I’m not sure how much it will help you from the writer end. I have been working on my thesis and ran across a few references with URLs that were broken (ok, more than a few). Because of my specialty, I am quite familiar with the website http://www.archive.org ... they, um, archive the web nonstop. Whenever I ran across a reference like this I just plugged in the URL at http://www.archive.org and then selected the date I wanted to see it on based on the citation… Fortunately, these URLs are stable. For example, Google’s homepage on June 15, 2007 can be seen at this archive.org link: http://web.archive.org/web/20070615151709/http://www.google.com/
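
The archive.org address in Kristen's example follows a visible pattern: a fixed prefix, a 14-digit timestamp (YYYYMMDDhhmmss), then the original URL. A minimal Python sketch of building such an address is below. The function name `wayback_url` and the constant `WAYBACK_PREFIX` are made up for illustration, and the pattern is inferred from the example link rather than taken from any official specification.

```python
from datetime import datetime

# Prefix copied from the example link in the comment above.
WAYBACK_PREFIX = "http://web.archive.org/web"


def wayback_url(original_url: str, captured_at: datetime) -> str:
    """Build an archive.org address for original_url as of captured_at."""
    # The timestamp segment is 14 digits: YYYYMMDDhhmmss.
    timestamp = captured_at.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Rebuilds the link for Google's homepage on June 15, 2007.
print(wayback_url("http://www.google.com/", datetime(2007, 6, 15, 15, 17, 9)))
```

A footnote that records the date accessed would give a reader everything needed to build the archived address by hand, provided the archive holds a capture of that page.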

  8. Phil Simon

    Scott

    I was going to suggest something but Divya beat me to it.

    For The Next Wave of Technologies, I have a URL in many footnotes that points to “Extras.” I’m using my site because I can control it. If John Smith’s company, XYZ, is bought, then XYZ.com/smart-reference.html probably goes away.

    A single page with citations might make sense, along with the disclaimer that they might change through no fault of your own.

    Also, while I’m at it, there’s a WP plug-in for a broken link checker that I run from time to time. Same concept.

    Good luck!

  9. Christopher Dillon

    Very timely post, Scott. I just went through this process with my second book, and used the approach suggested by “Drew @ Cook Like Your Grandmother,” treating the book as a snapshot in time.

    I made an exception for newspaper stories that were removed from the Web after 30 days. I didn’t bother to include a URL for these, because I knew it would be useless.

    I archived the disappearing sites using Zotero.org, which helped me but wouldn’t do much for readers.

    Webcitation looks like a good solution. Does anyone know how they would handle the disappearing stories mentioned above (or ones that are moved behind a pay wall) and if it’s open to self-publishers?

  10. Marius

    Hi,

    You could also mix option 2 and 3: host all the links on your site or your book’s site and redirect them to the original article by default. It would allow you to keep a copy should a URL stop working, and have a meaningful URL, e.g. yourbook.com/links/UniversityOrJournal/Author/Title/.

    This would become your book’s companion tool. And you could allow users/readers to post related URLs on such a link page.

  11. Fazal Majid

    I would include a full URL and a QR code. Avoid URL shorteners like the plague, unless it is your own.

  12. Luis Oliveira

    Why not do 2 and 3, and take advantage of the mighty Google by giving the context or search phrase that would return the link? You should be covered by this approach.

  13. Nathan Bashaw

    There’s a startup idea somewhere behind this problem. For now, I think the best thing to do, considering the circumstances, is to just make a “references” section on your website that you can update continually. Next to each reference, there should be a “report as broken” link that people can easily click to let you know when a link rots.

    If you wanted to really spice things up, it would be cool to make the references page a somewhat valuable experience in itself. A sort of preview to the book, if you will. I can definitely see myself checking out the references section of a book before I read it, and getting excited to see how the author tied together all this seemingly unrelated information.

  14. Harald Felgner

    Scott, from my experience it has to be a combination of 3., 4., and 5.

    3. Use a link converter or citation service – which can go out of business, with the drawback that ALL links then disappear at once, on top of being unreadable. Pro: Consistent. In combination with a (2.) online list: Clickable.

    4. State the reference in words to make it readable: E.g. Scott Berkun Blog, as of April 26, 2010. Pro: Now the reader may trust the source.

    5. Add the URL as of the time of citing: E.g. https://scottberkun.com/blog/2010/should-books-include-urls/

  15. cynthia

    the question i ask is, why are you sourcing the URL?

    is it because you’re adapting a book publishing style to incorporate the web as a publishing medium? (or mimicking web publishing?)

    is it to prove that an outside source did exist?

    is it to give the reader further information on a topic?

    i tend to like harald’s solution, especially “4. State the reference in words to make it readable: E.g. Scott Berkun Blog, as of April 26, 2010.”

    because the reader can google that phrase and land pretty close to your original source material in the first page of results.

  16. Krishnan

    Another option is to include the title of the article the URL refers to along with the URL. That way, readers can type the title if the URL is broken and get to other versions of the article on the web (including the search engine cached version).

  17. Craig

    Scott, this may be difficult with your schedule, but how about putting links to your blog posts in the book and then referring out from the blog to source material?

  18. Robert Hoekman, Jr

    Bypass the whole issue. In “Designing the Moment,” I told readers in the beginning about a single URL to go to

  19. James Kalbach

    Scott,

    My initial thought is to provide a unique Google search string that will get the source in the top 3-5 results, as Luis hints at. But even this will change, and doesn’t ensure that the reader will even get to the resource.

    Sometimes URLs contain additional clues about the source cited, like from the domain name and even from sub-directories. So including URLs may not just be about providing access but also helping describe the source. Shortening URLs doesn’t seem like an option, really.

    Links will always break, even if you maintain a separate site for them. So including the date accessed is a good idea, even though it comes across as a little pedantic.

    Is there any way that http://www.archive.org could help? Not sure how–just a thought…

  20. Michael Diamond

    I agree with Drew, especially that bad urls are to be expected, however I also like your idea of providing a clear, easy to find, and maintainable resource YOU (or your publisher, or, even better, the community) control in the book for people to double check resources.

  21. Sarath

    This is the problem we face with URLs. Recently I purchased REWORK, written by 37signals founders Jason and David. Most of its chapters mention URLs to websites. It’s not practical to type every character of a URL. In fact, we can just search for the title of the link (the article name). I don’t think bit.ly is a practical solution, because if the URLs become invalid within months, there’s no point in giving bit.ly links. Readers should be told to “Google” the title rather than take the pain of entering the entire URL. If it is too long, we can go for a bit.ly approach, I think.

  22. Ivan Pepelnjak

    There are (at least) three issues with URLs:

    #1 They are not descriptive
    #2 Some are long and tedious to type/copy
    #3 The content tends to change or disappear over time. Even worse, a web site might go offline and the domain be taken over by one of the spammers.

    You can address #2 with a URL shortener and #3 with webcitation.org. One or the other might disappear, but you would be no worse off than before.

    If you want to make URL quoting close to perfect, you should then:

    * describe the source (as other commenters have suggested)
    * provide the original URL and optionally the shortened URL if the original is too long
    * provide a webcitation.org (or equivalent) URL to ensure that the readers can access the snapshot you’ve seen when you’ve quoted it.

    Anyhow, having an online list of citations on your personal web site (don’t rely on the publisher, I’ve been burned by a major publishing house at least once) might make the lookup process easier and more convenient.

  23. Matt

    Would it be illegal to use something like http://posterous.com/ to post the content of the site you’re linking to? As Posterous is a kind of web scrapbook, you could post quoted sections in Posterous and at least these would always be 404 proof (unless Posterous goes out of business). Posterous always cites where the quoted text comes from too, which may be the one thing that causes 404s. This would only work for shorter citations. Posterous links are meaningful too.

  24. Ricardo Patrocínio

    You can put all the URLs on your (this) web site, in a page for the book. By doing this you will have complete control over the URLs, and can even make updates to the book online.

    Tim Ferriss did this in his “The 4-Hour Workweek” book, and to make sure only people in possession of the book could get in, he had a password that was hidden in the book, something like “What is the first word on page 185”

  25. Robert Shepherd

    Scott, there are lots of great ideas here. The book-dedicated site is something that has been used in the academic textbook field for years, and has also recently been used very effectively on the cheap by Stewart Brand for his new book Whole Earth Discipline (which I highly recommend as a read when you find time). Have a look at the companion site, sbnotes.com, that goes with the book.

  26. Mohit

    I agree with Drew that Errata is already a well understood concept.
    The other approach would be to set up a website (say http://www.MoI.org, for example), and have a links/footnotes page there. In the book, you can indicate the URLs as moi.org/links/1 & so on.
    Good luck!

  27. Daeng Bo

    This is a big problem in any situation where links occur — not just books. I created a whole site with reviews of games that worked under emulation, and included links to the pages and the download. Within a year, fifty percent of those links were broken and there was no way to fix them.

    For research, you want to link to a copy of the actual page you read. For this reason, you are best linking through archive.org. For example, the front page of CNN on my birthday, 2003 is here: http://web.archive.org/web/20030129185239/http://www4.cnn.com/

    Dan

  28. Jurgen Appelo

    I’m looking at the same problem, with the book that I’m writing.

    Option 1 is stupid, IMHO. A reference to a book that is not available anymore is just as bad as a hyperlink to a page that doesn’t exist.

    And besides, if the information _is_ available on-line somewhere it is much better than a reference to an off-line book, because people can check it out immediately. How many people are actually going to buy an entire book, just because there’s a reference to it?

    I think I’ll go for option 2.

    1. Scott Berkun

      Jurgen:

      I know as a reader I look at footnotes to sort out for myself how credible a particular claim is, even if I never look up whatever it is. Both URLs and book titles/authors give some sense of credibility. http://www.time.com/blahblah vs. http://www.blogger.com/blahblah informs me as a reader. Same goes for references to print journals, books, newspapers, whatever.

      I’ve never seen any data on how readers read footnotes. It’s effectively usability data on reading, but I’m not sure such a thing exists.

  29. Matt McC

    I’d go with #2. It’s like having an on-line errata page.

  30. Quinten Farmer

    The best solution to this that I have seen personally is what David Friedman did in his book “Law’s Order”. In sections that either had online citations, or contained content that he wanted to update continuously after the book was published, he included a small graphic in the margin that indicated there was additional content online. Readers could then visit the book’s website, find the relevant page, and explore all the additional content.

    This greatly enriched the reading experience by both keeping the book current and incorporating additional content without the drawbacks of including outdated URLs in the text itself.

  31. drm

    A couple of ideas about approaching it as curated search:

    People will type key phrases into Google more readily than they will enter a URL.

    Create a specific bibliography page in your blog with links to each citation. You can update it.

    If you want to extend the knowledge/scholarship, create a wiki with all of the sources, allowing it to be annotated and extended.

  32. Andrew Kazyrevich

    Hey Scott,

    Create a dedicated website for the book (I guess it’s something you will do anyway). Get a page there with all the links. In the book, all links will be

    [1] http://myth-of-innovation.com/some-cool-self-explanatory-name
    [2] http://myth-of-innovation.com/another-link

    They will redirect to something real. Once something is broken, you just update the redirects on the website. You could even link to Google cache if something no longer exists :)

    –Andrew

  33. Pk

    Scott, have a look at QR codes. They don’t solve the problem of URLs going wrong, but I guess no one has control over that. Still, it’d be useful for your readers to just use the QR code to find your reference.

  34. Krish

    Thought of something today. One possible solution is to create the URL as follows:

    https://scottberkun.com/urls/ipad_design_url

    Direct to that URL if the URL is still active
    Else, redirect to an alternative URL

    Of course, this may require some programming and maintenance on your part.

    URLs are good when the book is available in an electronic format, so don’t get rid of them.
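
The "direct if active, else redirect" rule Krish outlines could be sketched in Python roughly as follows. Everything named here is hypothetical: the `Reference` type, the `REFERENCES` table, the `resolve` function and the example.com targets are placeholders, not part of any existing site. The liveness check is passed in as a function so the rule itself stays independent of how links are tested.

```python
from typing import Callable, Dict, NamedTuple


class Reference(NamedTuple):
    original: str  # the page cited in the book
    fallback: str  # where to send readers if the original is gone


# Hypothetical table mapping the short name printed in the book to its targets.
REFERENCES: Dict[str, Reference] = {
    "ipad_design_url": Reference(
        original="http://example.com/ipad-design",
        fallback="http://example.com/archived/ipad-design",
    ),
}


def resolve(slug: str, is_alive: Callable[[str], bool]) -> str:
    """Return the address a reader should be redirected to for this slug."""
    ref = REFERENCES[slug]  # raises KeyError for a slug that is not in the table
    return ref.original if is_alive(ref.original) else ref.fallback


# With a check that reports every link as dead, the fallback is chosen.
print(resolve("ipad_design_url", lambda url: False))
```

The fallback could be an archived copy or a page that explains what the old reference was, which keeps the printed address useful after the source disappears.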

  35. Leszek Cyfer

    A Andrew Kazyrevich said, a good way is to make a webpage for the book – and then put “More info online” or “Links online” footnotes whenever you send reader to the web. Readers find many different links (even shortlinks) bothersome – better give them one and then let them click-click-click from there.

  36. Leszek Cyfer

    Oops – “As” Andrew Kazyrevich said :)

    And no additional “tails” – just simple address to the book-page – let the readers click-on from there (obviously a good page layout is a must).

    You can set up a robot to check the links on the page for 404 errors and to send you an email when one needs fixing.
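
A link-checking robot of the kind Leszek mentions might look something like this minimal Python sketch, which uses only the standard library. The function names `fetch_status` and `find_broken` are invented for illustration, sending the email is left out, and it assumes the target servers answer HEAD requests, which not every server does.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered with an error such as 404
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def find_broken(
    urls: Iterable[str],
    get_status: Callable[[str], Optional[int]] = fetch_status,
) -> List[str]:
    """Return the URLs that failed to respond or responded with status >= 400."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken
```

Run on a schedule against the book's links page, the returned list would be the body of the notification. Passing a different `get_status` function makes it possible to try the logic without touching the network.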
