Archive for April, 2006|Monthly archive page

NetNewsWire: I wish I wasn’t poor

After ranting about Shrook’s cool subscription service, I discovered that NetNewsWire integrates with your Bloglines account. Imagine: you go to work and read feeds all day (no, that’s not what you’re paid to do, but hey…). You come home and fire up NetNewsWire (because you’ve already been offline for like 30 minutes) and all of the feeds that you read during the day are already marked as read. You read some new feeds, which don’t show up in your Bloglines account because you read them in NNW. That’s sexy desktop-web integration, and I wish I had the cash to buy NetNewsWire.

Update: Vienna, an open-source reader for OS X, might support Bloglines integration in the future.

Update 2:
RSSOwl may already have this kind of synchronization. I know that it will automatically update the list of feeds if I change it in Bloglines, but I’m not sure about items yet. RSSOwl is open-source and cross-platform (Win/Mac/Linux). It’s not quite as pretty as NNW/Shrook/Vienna, but it has the features I want. Look here for how to synchronize with Bloglines.

Backpack

Okay, I'm trying to use Backpack again to organize my workflow. Normally I hate all organizers, but I'm willing to give Backpack another try.

What I really need is to maintain a neat set of notes. Not dates, those I can remember. Notes about

  • projects I'm working on
  • papers I'm writing
  • code I'm writing
  • questions I have
  • what's done, what isn't done, and what are the barriers to getting undone things done

So far I've tried 3 techniques to maintain these notes:

  1. I've made labels in Gmail and sent messages to myself containing this information. This isn't a bad technique, but things get cluttered. It's too bad you can't edit emails (you kind of can, using replies to yourself). Besides Backpack, this has been the best solution. But it's difficult to move from label to label all the time.
  2. Paper. I jot things down on paper, which I still believe is the most flexible technological tool of our time. But you need an organization system for paper — file folders or something. I don't have enough space and I'm not willing to have a filing cabinet for temporary notes that are going to get trashed after about a month.
  3. Text files. This isn't a bad solution, either. But it's too easy to just let things slide – on my work computer, opening a folder and loading a text editor really is time consuming.

So I'll see if Backpack can help me — and if I can stay with it. So far it seems to be working out — I have a page for projects, a page for papers/conferences and a page for timesheets. I add a new note for each client project or paper and a to-do list on my front page. I also have a front-page note that says where I left off the day before. Presentation is convenient, since I can use Textile (which is like Markdown) to have headers, lists, etc. (by the way, Markdown is better). And if I wanted I could export everything to XML for backup (not that I'd be able to do anything with that data, but comforting still).

SubEthaEdit and backing up your toolkit

My Journey to Macintosh:

So the reason that I picked today to blog about this? It is no secret. MacZOT are running a blogging thing where you can get a copy for free if you blog about SubEthaEdit today.

SubEthaEdit is a cool editor. But it's $35 and as far as I can tell, it doesn't offer any column-selection features, which I use a lot in statistical programming. But that's not the point. The point is that SubEthaEdit used to be free for noncommercial use, then they ended the promotion. So when I learned that I could have received a free copy, I cursed myself for expunging MacZOT from my feed list. But then I remembered: "hey, I have an old copy on CD, I already have a legal, free version of that editor."

Which brings me to my point: It's a good idea to periodically back up all of the free/open-source software that you cherish, because some day it might be an expensive high-end commercial product!

Shrook: Very cool RSS

Shrook 2 – RSS and Atom for Mac OS X:

Shrook is a next-generation news reader that is not only easy to use, but offers advanced features not available to Mac users anywhere else. It supports all versions of RSS and Atom. Oh, and it’s free.

I never thought that desktop email clients were useful. But then I started managing a bunch of different Gmail accounts and found that (a) Gmail worked well with desktop clients because I could archive my pop messages and still read them when using the browser interface and (b) it is way too annoying to log in and log out of 3 or 4 different accounts. So now I use Mail.app.

And then I thought that desktop blog clients were useless, until I started managing a bunch of different WordPress blogs and got tired of logging in and out to post. So I bought ecto, which makes it a lot easier to create and manage meaningful (and meaningless…) blog entries. I wouldn’t have bothered if I hadn’t found a blogging platform that I really liked that also integrated with a high-quality desktop client.

The point is this: web apps and desktop apps are complementary. Web apps like Gmail, Google Calendar, Writely, NumSum and Bloglines probably won’t replace desktop apps — they’ll just make them far more useful. For me, this point has already been proven by WordPress.com and Gmail.

Shrook is more proof of concept. I’ve been using Bloglines and Google Reader, thinking that desktop readers would only tie me to my desktop. But Shrook is a great reader that bridges that gap and offers a lot of new functions that no other aggregator, browser-based or desktop, has. It lets you synchronize your desktop client with a shrook.com account, so multiple machines running Shrook will stay synched, and you can check your feeds from a browser as well. This is a paid service ($20/year), but the client is freeware. I think that this is a great strategy.

Cool client features include:

  • iTunes-like smart groups (like smart playlists) that create feed groups based on keywords
  • An iTunes-like browsing interface, so you can view feeds one-by-one or in a “river of news” format
  • Podcasting integration: If Shrook detects a podcast, it lets you view the text in the reader, but it also adds the audio file to your iTunes library automatically
  • Flagging (just like starring in Google Reader) is very convenient as a way to save something for later without going to any great lengths
  • Easy feed searching with a search bar that mirrors Camino, iTunes, Safari and Finder
  • Mark all read: This is a must-have feature
  • Integration with ecto (Google Reader has BlogThis!, but I always wanted something that would let me send stuff to my WordPress blogs, besides the ectoize bookmarklet, which never worked with Firefox)

Moreover, the strategy of selling web integration as a service and distributing the client for free is a great business idea. The synchronization service is innovative and worth paying for (for RSS geeks, that is). And it will generate more revenue than the client would (assuming it cost, say, $30 and wouldn’t become obsolete for 2-3 years). I wonder if more application-service combinations will arise. Imagine if Microsoft made a basic version of Word available for free, but created a document-collaboration subscription service through Windows Live. This is an obvious extension of the “open-source the software, but charge for tech support” strategy. And increasingly, computing is about service integration — think about all of the focus on XML, RSS and ODT.

LaTeX: from beginner to TeXPert

This post introduces the LaTeX typesetting system. After digesting the information below, you’ll be able to:

  • Download and install LaTeX on your PC or Mac
  • Create basic documents using LaTeX
  • Install new LaTeX packages
  • Insert tables and figures into a LaTeX document
  • Use LaTeX’s cross-referencing, footnote and basic bibliography features
  • Insert equations into a LaTeX document

These topics cover the majority of tasks that most people need to do when writing a document. However, please note that while the LaTeX system makes it very easy to create professional-looking documents, it is both comprehensive and extensible. There are many topics that are not covered by this basic tutorial. Fortunately, LaTeX is very well documented. If you come across something that you can’t figure out how to do, ask your old friend Google for help.

What is LaTeX?

At its core, LaTeX is a typesetting system that allows authors to create highly polished documents without having to worry about formatting, page breaks, object positioning, or any other style concerns that distract authors from focusing on writing. LaTeX is pronounced “lay-tech,” as it is an extension of TeX (“tech”), the original typesetting system. You can read all about the history of TeX and LaTeX on Wikipedia.

LaTeX is used widely in a variety of professions. Mathematicians, physicists, economists, statisticians and other academics and professionals that regularly use mathematical notation in their documents often use LaTeX because of the ease with which it handles such notation. Many publishers use TeX-based systems for typesetting documents.

How does LaTeX work?

LaTeX differs from traditional word processors in two fundamental ways:

  1. Generally, LaTeX documents are written using the easy-to-learn LaTeX markup language, rather than by using a graphical interface to apply styles[1].
  2. LaTeX works with your document after you have entered your text. So unlike word processors, it can use information about the total length of your document, number of tables, etc. to find the optimal places for tables, figures, page breaks, etc.

The following is an example of a very basic LaTeX document:

\documentclass{article}
\author{Your Name}
\title{Test Document}
\begin{document}
\maketitle
This is a test document
\end{document}

With any LaTeX distribution, saving the above text as a .tex file and running LaTeX on that file would produce the following:

[Image: typeset output of the test document]

LaTeX is designed to create the same output on any system. So if you distributed the above text to anyone with a working LaTeX distribution, regardless of their particular system, they would get the exact same result. LaTeX outputs files in several formats, but the most popular is PDF.

Getting LaTeX

All you technically need to create LaTeX documents is a LaTeX engine — the binary files and libraries that will convert plain text tex files to polished pdf files. LaTeX can be run from the command line, so *nix and DOS aficionados will feel right at home. However, using a frontend for LaTeX can make things much easier. Most frontends are essentially text editors with functions to

  • Compile documents with LaTeX without using the command line
  • Facilitate writing in the LaTeX language (wizards for table creation, code completion, etc.)

In this document, I assume that you’ll need both a LaTeX engine and a frontend. There are many engines and frontends to choose from on every operating system. I’m going to describe how to install the most popular (and easy to install) open-source tools for OS X and Windows. The only differences between the distributions that I describe and others are in configuration and in the practical details of each application, so feel free to try out other distributions.

On Mac OS X

Engine. gwTeX is a free and open-source LaTeX distribution for OS X that comes with a graphical installer. To install, you download the i-Installer application, select a mirror, then select the TeX package. Additional installation instructions are available on the download page. Once installation is complete, all you need is a frontend.

Frontend. TeXShop is a very popular LaTeX frontend for OS X. Installation requires a simple drag and drop to the /Applications folder. TeXShop is automatically configured to work with gwTeX, so if that’s the engine that you’re using, you’re set.

To test out your distribution, try saving the sample document above as a .tex file and running LaTeX on your document by pressing command-t. If everything is configured properly, a window will appear similar to the example output above, and a new PDF file (as well as a log file) will appear in the directory where your file is saved.

On Windows

Engine. MiKTeX is a popular open-source distribution. To install, visit this page, download the executable, and follow the dialog. Additional installation instructions are on the download page.

Frontend. TeXnicCenter is an open-source frontend with many helpful features. Installation is standard: just download and open the executable, which launches a wizard.

TeXnicCenter is automatically configured to work with MiKTeX. To test out your setup, save the sample document above as a .tex file using TeXnicCenter and select Build > Current file. If everything is set up properly, a new PDF file (along with a log file) will be created in the directory where your document is saved.

On Linux

Linux systems have their own application management utilities (apt-get or rpm, for example), and installation will depend on your particular Linux distribution. Ubuntu users can use the Synaptic Package Manager. Kile is a popular and easy-to-use frontend that works with both KDE and Gnome.

A note about file types

LaTeX can make several types of output files, including PDF and DVI (device independent) files. The type of output depends on whether PDFLaTeX is used to process the file, or another program. The default for the frontends described above is to create PDF files, but be aware that changing these settings might affect the type of output created.

LaTeX basics

LaTeX commands

LaTeX commands generally begin with a backslash and take the form \command[options]{argument}. For example,

\section{Introduction}

would define a new section, named “Introduction.” The “%” character defines a comment, and everything from that character to the end of the line is commented out and will be ignored by LaTeX. To insert the “%” character into a document, escape it with a backslash: \%.

Quotes work a bit differently in LaTeX. To insert quote marks, use the form ``text''. That is, type the ` character (the backtick, top left of the keyboard) twice before the text, and the single quote character, ', twice after it.
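
As a quick illustration of the rules above, here is a minimal sketch of a complete document that uses a comment, an escaped percent sign and LaTeX-style quotes. The sentences themselves are made up for the example.

```latex
\documentclass{article}
\begin{document}
% This whole line is a comment and will not appear in the output.
Sales rose by 10\% last year. % The escaped percent sign is printed.

She called the result ``surprising'' in her report.
\end{document}
```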

The preamble

Everything before the line “\begin{document}” is part of the preamble. A typical preamble might look like this:

\documentclass{article}
\usepackage{graphicx}
\title{Test}
\author{Test}
\date{}

In the example above:

  • \documentclass{article} tells LaTeX that the document is an article. Other classes include book, letter and slides
  • \usepackage{graphicx} tells LaTeX to use the graphicx package, which allows users to include many types of graphics in their documents. Packages are covered later on
  • \title{} and \author{} obviously define the title and author
  • \date{} tells LaTeX to leave the date blank. \date{April 2006} would print “April 2006” as the date. Leaving the \date{} line out would cause LaTeX to use today’s date.

The \documentclass{} command has options. For example, \documentclass[11pt,twocolumn]{article} would set the text in 11-point type and organize the body of the document into two columns. Note that options are separated by a comma. Other options include:

  • oneside or twoside – change margins for a one or two-sided document
  • landscape – change the document from portrait to landscape
  • titlepage or notitlepage – define whether there is a separate title page, or if the title, author and date info are presented at the top of the article
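
To see how several options combine, here is a small sketch. The particular mix of options is hypothetical and chosen only to show the syntax.

```latex
% Hypothetical combination of class options, for illustration only
\documentclass[11pt,twoside,titlepage]{article}
\title{Test}
\author{Test}
\date{April 2006}
\begin{document}
\maketitle % with the titlepage option, the title gets its own page
The body text starts on the page after the title page.
\end{document}
```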

The document body

Everything after the preamble and between \begin{document} and \end{document} is part of the document body. Most of a LaTeX document is simply plain text. To start a new paragraph, press return twice so that there is a blank line between the paragraphs. LaTeX treats a single line break as an ordinary space. To force a line break, use \\.
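
Here is a minimal sketch of how paragraph breaks and forced line breaks look in a source file. The wording of the sample text is my own.

```latex
\documentclass{article}
\begin{document}
This line and
the next one are joined into a single paragraph,
because a single line break counts as a space.

A blank line, like the one above, starts a new paragraph.
This sentence is followed by a forced line break.\\
This text starts on a new line within the same paragraph.
\end{document}
```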

Document structure

A document’s structure is defined using sectioning commands such as \section{}. LaTeX works best with well-structured documents. The structure commands include:

  • \section{Name}
  • \subsection{Name}
  • \subsubsection{Name}
  • \paragraph{Name}

To insert an unnumbered section, use the command \section*{Name}. The section numbering will continue as normal with the next section, subsection, etc.

The \paragraph{} command doesn’t need to be included unless you want to insert a heading for a paragraph. The image below shows the different structure commands in use:

[Image: typeset output showing the section, subsection, subsubsection and paragraph headings]
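
For reference, a short sketch of source that uses each structure command. The heading names are hypothetical, and the numbering noted in the comments is what the standard article class produces.

```latex
\documentclass{article}
\begin{document}
\section{Introduction}        % numbered 1
\subsection{Background}       % numbered 1.1
\subsubsection{Early work}    % numbered 1.1.1
\paragraph{A side note} This heading runs into the paragraph text.
\section*{Acknowledgements}   % unnumbered
\section{Methods}             % numbered 2, so the count continues
\end{document}
```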

Environments

Environments are special blocks of text. For example, the itemize and enumerate environments create bulleted and numbered lists, respectively. The following markup:

\begin{itemize}
\item First thing
\item Second thing
\item Third thing
\end{itemize}

\begin{enumerate}
\item First numbered thing
\item Second numbered thing
\end{enumerate}

would produce a bulleted list followed by a numbered list.

Note that environments always begin with \begin{environmentname} and end with \end{environmentname}. They can be nested, so one item of a bulleted list might contain another bulleted list, or a numbered list, etc.

Other frequently used environments include:

  • Quote: \begin{quote}…\end{quote} creates a section of indented, quoted text
  • Verbatim: \begin{verbatim} … \end{verbatim} is similar to pre in HTML. In the verbatim environment, text is printed in a monospace font and special characters are printed as typed instead of being interpreted as commands. Verbatim is useful for typing code tips
  • Description: \begin{description} \item[First item] text \end{description} creates a list of items with bolded names and hanging-indented text after the item name
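
The environments above can be sketched together in one small document. The list contents are made up. Note the numbered list nested inside a bulleted item.

```latex
\documentclass{article}
\begin{document}
\begin{itemize}
\item Fruit
  \begin{enumerate}   % a numbered list nested inside a bulleted item
  \item Apples
  \item Pears
  \end{enumerate}
\item Vegetables
\end{itemize}

\begin{description}
\item[LaTeX] A typesetting system built on TeX.
\item[TeX] The original typesetting system.
\end{description}

\begin{verbatim}
Special characters such as % and \ are printed as typed.
\end{verbatim}
\end{document}
```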

Modifying text styles

The basic idea behind LaTeX is to absolve the author of formatting duties. Nevertheless, it’s still occasionally necessary to manually format certain text styles.

  • To insert bold text, use \textbf{text here}
  • To insert italic text, use \emph{text here}
  • To insert monospace text, use \texttt{text here} (the tt stands for teletype)
  • To use verbatim text within a sentence, use \verb|your text here|. Note that almost any character that does not appear in the text itself can serve as the delimiter; for example, \verb+your text here+ will produce the same results
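
A minimal sketch that puts the four style commands side by side:

```latex
\documentclass{article}
\begin{document}
This is \textbf{bold}, this is \emph{emphasized}, and this is
\texttt{monospace}. Inline verbatim text such as \verb|\section{Name}|
is printed exactly as typed, backslash and braces included.
\end{document}
```

Keep each \verb on a single line of the source, since its argument cannot contain a line break.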

Packages

Packages extend LaTeX’s functionality. Package installation essentially consists of two steps:

  1. Running LaTeX on the .ins file to produce .sty or .cls files
  2. Copying the newly created files to an appropriate directory and updating the LaTeX database

However, there are exceptions. The filetypes .sty and .cls stand for style and class, respectively. If a package does not come as a .ins file, but rather a sty or cls file, it does not need to be processed with LaTeX, and you can skip directly to step two. Also, a package that comes with a .ins file usually includes a .dtx file as well, which holds the documented source. This file can be processed with LaTeX to create a manual for the package.

Note: To process a package file (ins or dtx) with LaTeX, just open that file with your frontend and process it like you would a normal tex file.

OS X. To install a new package on your Mac using gwTeX, process the files as described above, and move the sty, cls and other files to ~/Library/texmf. If this directory does not exist, create it.

Windows. The easiest way to install a package on a PC using MiKTeX is to use the MiKTeX package manager, which is available through the Start Menu. Just open the package manager, select a mirror, and navigate to the package that you want to install. MiKTeX will take care of the rest. Another nice feature of MiKTeX is that if you are processing a .tex file that requires a package that isn’t installed on your machine, it will prompt you to download it.

Next, I discuss two popular packages: graphicx and geometry. These packages are already installed with gwTeX and MiKTeX, so there is no need to download and install them.

The graphicx package

The graphicx package allows you to insert images into a LaTeX document. To use it, first use the command \usepackage{graphicx} in your document preamble. Then, to insert a graphic, use the command:

\includegraphics[options]{filename.png}

graphicx supports many filetypes, including PDF, PNG and JPG. The options include:

  • width=Xin
  • height=Xin
  • scale=X (where X is a scaling factor: values between 0 and 1 shrink the image, and values above 1 enlarge it)
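
Here is a small sketch of these options in use. It assumes that you compile with PDFLaTeX and that an image called plot.png (a hypothetical file name) sits in the same directory as your .tex file.

```latex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
% plot.png is a hypothetical file name; substitute your own image.
\includegraphics[width=3in]{plot.png}

% The same image, scaled to half of its natural size.
\includegraphics[scale=0.5]{plot.png}
\end{document}
```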

The geometry package

While formatting documents using LaTeX is easy, changing those default formats can be fairly difficult. The geometry package can make changing certain aspects of your document, including the margins, much easier. To change the margins to 1″ all around, for example, use

\usepackage[margin=1in]{geometry}
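
The margins can also be set one side at a time. A minimal sketch, with margin sizes that are hypothetical and chosen only to show the syntax:

```latex
\documentclass{article}
% Hypothetical margin sizes, for illustration only
\usepackage[left=1.5in,right=1in,top=1in,bottom=1in]{geometry}
\begin{document}
The left margin is now wider than the other three.
\end{document}
```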

Other packages

For just about every modification that you might want to make to a standard LaTeX document, there is a premade package to help you do so. To learn more about the packages described, or to download new packages, visit the Comprehensive TeX Archive Network (CTAN).

Figures and tables

Figures and tables are LaTeX environments; however, they have special attributes, such as the \caption{} command, which gives tables and figures names. They are called float elements, because their position in the final compiled document depends on LaTeX’s float-placement algorithm.

Figures

To insert a figure, use

\begin{figure}[hbtp]
\caption{Figure name}
\begin{center}
\includegraphics{filename.pdf}
\end{center}
\label{your-reference-key}
\end{figure}

In the above markup,

  • \begin{figure} simply tells LaTeX that there is a figure environment
  • [hbtp] determines how LaTeX will place the figure: here (h), bottom (b), top (t), page (p). LaTeX will first attempt to insert the figure at its insertion point in the tex file. If this is not possible due to space or other aesthetic considerations, it will try to place it at the bottom of the page, then at the top of the page, then on a special page reserved just for float elements. The order in which h, b, t and p are specified determines where LaTeX tries to place the float first. To ask LaTeX to keep the graphic in its original place, for example, you could put \begin{figure}[h], omitting b, p and t
  • \caption{Figure name} specifies the name of the figure
  • \begin{center} simply tells LaTeX to center the figure on the page. Don’t forget to end the centering environment before you end the figure environment
  • \includegraphics{…} specifies the location of the file that is being inserted as a figure
  • \label{your-reference-key} is a label that you can use to refer to the figure in the text. For example, if you label your figure “fig1” then you can reference it later on by typing \ref{fig1}

Tables

A floated table in LaTeX consists of two environments: table, the actual floated entity in the text, and tabular, the data contained in the table. For example,

\begin{table}[hbtp]
\caption{This table is an example}
\begin{center}
\begin{tabular}{c|cc}
First row, first column & First row second column & First row, third column \\ \hline
Second row, first column & Second row, second column & Second row, third column \\
Third row, first column & Third row, second column & Third row, third column \\
\multicolumn{3}{c}{…}
\end{tabular}
\end{center}
\label{exampletable}
\end{table}

would produce

[Image: the typeset example table]

Everything except the code between \begin{tabular} … \end{tabular} is the same as the figure environment described above. Here’s how the tabular environment works:

  • \begin{tabular}{c|cc} tells LaTeX to start a new tabular environment with three centered columns. The bar (“|”) after the first “c” tells LaTeX to draw a vertical border to the right of the first column. Using {lcrr} would create four columns, the first left aligned, the second centered, and the third and fourth right aligned
  • Table cells are separated by “&” and table rows are separated by “\\”
  • \hline creates a horizontal line
  • \multicolumn{3}{c}{Text here} creates a row that spans all three columns, is centered, and contains the text “Text here”

There are more complicated options for creating and inserting tables, but the rules above cover about 90% of all table needs.[2]
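
To make the column specification concrete, here is a small sketch of a four-column {lcrr} tabular. All of the cell contents are made up for the example.

```latex
\documentclass{article}
\begin{document}
\begin{tabular}{lcrr}
Item & Code & Quantity & Price \\ \hline
Apples & A & 10 & 2.50 \\
Pears & B & 5 & 12.00 \\
\multicolumn{4}{c}{Prices are made up for this example}
\end{tabular}
\end{document}
```

Right-aligning the numeric columns keeps the digits lined up, which is the usual reason to pick r over c for numbers.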

Annotations

LaTeX is capable of automatically creating important annotations, such as footnotes, cross references, tables of contents and bibliographies. Note that, since the following commands require LaTeX to automatically number text elements, LaTeX must be run on your document twice for proper display.

Footnotes

To insert a footnote, simply type \footnote{Footnote text here}. LaTeX will automatically insert the footnote number and text.[3]
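
A minimal sketch of a numbered footnote, together with the \thanks{} command mentioned in the notes at the end of this post. The footnote texts are hypothetical.

```latex
\documentclass{article}
\title{Test Document}
\author{Your Name\thanks{Hypothetical attribution note.}}
\date{}
\begin{document}
\maketitle
LaTeX numbers footnotes for you.\footnote{This text appears at the
bottom of the page.}
\end{document}
```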

Cross references

To reference a labeled Table or Figure, use \ref{your-reference-key} where “your-reference-key” is the argument to the \label{your-reference-key} command in the table or figure environments.
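
Here is a short sketch of a cross reference in use. The \pageref{} command, which prints the page number of the labeled item, is my addition and is not covered elsewhere in this tutorial. The table contents are placeholders.

```latex
\documentclass{article}
\begin{document}
Table \ref{exampletable} appears on page \pageref{exampletable}.

\begin{table}[hbtp]
\caption{This table is an example}
\label{exampletable} % the label must come after the caption
\begin{center}
\begin{tabular}{cc}
a & b \\
c & d
\end{tabular}
\end{center}
\end{table}
\end{document}
```

On the first run the references show up as question marks; the second run fills in the numbers.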

Table of contents

To insert a table of contents, simply put \tableofcontents at the beginning of your document body, right after \begin{document} (or after \maketitle, if you use it). (You must run LaTeX twice to get the table of contents and references to work correctly.)

Bibliography

To create a bibliography, insert a list of the citations at the end of your document, using the form:

\begin{thebibliography}{99}
….
\bibitem{key1} Disarray, General. 2006. ``\LaTeX{}: from beginner to \TeX pert.'' \emph{General Disarray Blog}. Available online at \texttt{http://generaldisarray.wordpress.com}. ….
\end{thebibliography}

You must manually type the bibliography entries. To refer to an item within the text, use \cite{key1}, where key1 is the key given in the matching \bibitem{} command. The {99} tells LaTeX that there are a maximum of 99 entries in the bibliography. LaTeX needs to know this so it can correctly justify the bibliography entries with their numbering on the left.
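
Putting the pieces together, here is a minimal sketch of a complete document with one citation. The opening sentence is made up; the bibliography entry is the one shown above.

```latex
\documentclass{article}
\begin{document}
This tutorial is one of many introductions to \LaTeX{} \cite{key1}.

\begin{thebibliography}{99}
\bibitem{key1} Disarray, General. 2006. ``\LaTeX{}: from beginner
to \TeX pert.'' \emph{General Disarray Blog}. Available online at
\texttt{http://generaldisarray.wordpress.com}.
\end{thebibliography}
\end{document}
```

As with cross references, run LaTeX twice so that the citation number is resolved.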

A more efficient way to create bibliographies is to use BibTeX, which allows you to maintain a database of citations and call them as needed in your bibliography. There are also graphical tools for managing your reference databases, so you don’t have to hard code the citations, and can easily change them to different formats. However, BibTeX is too complicated to explain in this document. For an introduction, see this page.

Inserting mathematics

There are several ways to include mathematical notation in LaTeX documents. The most common are inline notation and the displaymath environment.

Inline

To include some mathematical notation within a paragraph, without offsetting from the rest of the text, enclose the notation between dollar signs. For example, $a^2+b^2=c^2$ is our favorite theorem.

Display math

The displaymath environment lets you offset some mathematical notation from the rest of the document. The code

\[
a^2+b^2=c^2
\]

would create a paragraph break and center the equation on the page.

Equation

The equation environment can be used to place numbered equations in the text. For example,

\begin{equation}
a^2+b^2=c^2
\label{pythag}
\end{equation}

would offset the equation just like the displaymath version did, but it would have a number in parentheses on the right, and you would be able to call it in the text by typing, for example, “as we see in equation \ref{pythag}…”

Equation array

The eqnarray environment allows you to align parts of equations at the equal sign. For example,

\begin{eqnarray}
a&=&b+c\\
d&=&e+f
\end{eqnarray}

would produce

[Image: the two typeset equations, aligned at the equals signs]

Mathematical notation

There are many commands for inserting specific mathematical operators and symbols into equations. They can all be found online, and as always, use Google if you can’t figure out a specific command. The following are some common operators and commands:

Greek letters: Generally, just use the spelled-out letter. For example, \beta, \gamma and \epsilon. For upper case, capitalize the first letter, as in \Gamma.

Misc symbols: \leftarrow (use \Leftarrow for a double arrow), \rightarrow, etc., \Leftrightarrow (<==>, if and only if), <, >, \leq (less than or equal to), \geq (greater than or equal to)

Indexing and exponents: Subscripts are denoted using the underscore (x_i) and superscripts use the “^” key (a^2). To type “i sub j comma k” you need to write “i_{j,k}” to tell LaTeX that the “j,k” comprises the entire subscript. The curly braces are generic grouping characters in LaTeX, and they won’t appear in your document.

Some operators: \sum{1/x} or \sum_{i=1}^{\infty}{x_i}, \prod (the product), \coprod (the coproduct), \sin, \log, \max, etc.

Decorations: \hat{x}, \tilde{x} , \overline{x}, \underline{x}, \overrightarrow{x}, \overbrace{x}, \underbrace{x}, \vec{x}

Fractions: \frac{a}{b} puts a over b.

Brackets: For brackets use “(”, “[” or \lbrace and \rbrace for “{” and “}”. However, if the notation that you’re typing is not inline, use \left( <math here> \right) or \left\lbrace <math here> \right\rbrace, so that the brackets grow to fit their contents.

Matrices: To insert a matrix in either the display math or equation environments, use

\left[ \begin{array}{ccc}
a & b & c \\
d & e & f
\end{array}\right]

Note that the array environment is similar to the tabular environment described above. The code shown above would produce:

[Image: the typeset matrix]

For help with other symbols and operators, see this page.
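
Several of the pieces above can be combined in one formula. The following is only a sketch: the formula (a least-squares slope for a line through the origin) is my own choice of example, picked because it uses a decoration, subscripts, superscripts, sums, a fraction and auto-sized brackets.

```latex
\documentclass{article}
\begin{document}
Inline math: $x_{i,j}^2 \leq \frac{a}{b}$.
\[
\hat{\beta} = \left( \sum_{i=1}^{n} x_i^2 \right)^{-1}
\left\lbrace \sum_{i=1}^{n} x_i y_i \right\rbrace
\]
\end{document}
```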

For further reference

The instructions above cover many of the basic functions of LaTeX, but there are many more. A good, thorough introduction is The Not-So-Short Introduction to LaTeX (pdf).

Download

This tutorial is available as a PDF file.

Notes

[1] Commercial implementations of LaTeX, such as Scientific Word, do offer a graphical interface. LyX is an open-source, LaTeX-based what-you-see-is-what-you-mean typesetting system that essentially uses a graphical interface to apply LaTeX markup to text.
[2] OpenOffice users can use Calc2LaTeX to convert between Calc spreadsheets and LaTeX tables. MS Office users can try Excel2LaTeX, which does the same thing, using Excel spreadsheets. Both utilities are cross-platform.
[3] To create an “attribution” footnote, where the first footnote is marked by an asterisk, use the \thanks{text here} command.

Word. Dugg.

Wow! Hello Digg effect. Apparently my post about the proper way to use Word struck a nerve or two.

I want to talk about two things regarding this post.

1. The comments: LaTeX and OpenOffice

The comments about my post all seem to relate to OpenOffice or LaTeX being better than Word. I don't necessarily disagree, either. Many times, I have posted on my blog about how useful LaTeX is, and I'm a huge supporter of both OpenOffice and its Mac equivalent NeoOffice. 

In a perfect world, I would typeset about 90% of my documents using LaTeX and the other 10% using OpenOffice (for more graphically complicated layouts, WYSIWYG really is easier than structural markup).

But our world isn't perfect. I have to use Word for work. Although OpenOffice achieves an impressively high degree of compatibility with M$ Office, there are still annoying import errors — especially on documents with complicated tables (the type that I produce at the consulting firm where I work). I tried using OpenOffice and it caused nothing but headaches and me getting screamed at for my documents not looking right. And I do have to admit that OpenOffice is (a) a bit slow and (b) a bit of a memory hog. Hopefully the developers will continue to work on that. And nobody in my field uses LaTeX, which is a real shame.

Besides, the point of the article was not that Word is a superior choice, but that Word users should use the structured document features of their word processor instead of treating it like a typewriter. This applies to OpenOffice users as well. The fact of the matter is that most writing professionals still use Word, and I'm tired of them sending me garbled crap.

Finally, I'm surprised that nobody flamed me about Lout being better.

2. The Digg effect

[Updated: 18 April 06]

My blog went from having about 30 views/day to a max of 34,801. I know that some sites get that kind of traffic every day, but I don't, and I'm in awe of the velocity of the interweb at the moment. Traffic the next day was a little under 1/2 what it was on Digg day, with some rebound traffic from del.icio.us and Lifehacker. I expect it will gradually drop until I'm back in my own blog ghosttown, which is fine. And I noticed that my blog was at the top of the WordPress.com Hot Blogs Today list, which was scary.

Stats2Cropped.png

Right now I'm trying to come up with some other innovative articles that I could write. I thought about an intro to LaTeX, but that's been done many times over. I suspect that part of what made the Word post so popular was that almost everyone uses (or has used) Word, and they're aware of all these other features that they've never used, but nobody has ever shown them how to use them or what they do. If this is true, maybe a little Excel tutorial is in order. Any ideas? My addy is "disinterested" at the domain associated with the free email service provided by Google.

Oh yeah, a huge shoutout to the WordPress.com guys for making such a great service. I see so many pages hit the Digg frontpage then disappear forever due to the load, and that didn't happen here. Whatever you're doing, keep doing it.

Response to “On Social Bookmarking and Voting”

Recently, this post about the “danger” of social news sites like Reddit.com and Digg.com made its rounds through those same sites, even rising to the Reddit front page. There’s an irony here, which I’ll get to later, but first I want to address some of the logical flaws in the argument.

The author gets the intent of social news sites right:

The idea is that users as a collective diversity of minds can make better editors for each other than a singular, unitary mind. The appeal of this idea extends to those who want to promote some stories that would normally not get much media coverage because of the bandwagon tendencies of the media, which is a good thing.

That’s a great point. There are some stories that don’t make their way to the mainstream media. Some of them are too far to the left, some too far to the right, and some just cover topics that don’t really appeal to a mainstream audience. So, what a great idea — a site where anyone can post an article, and people get to vote on whether the story should be promoted or demoted.

But wait, this is a bait and switch! The author actually distrusts those sites.

Unfortunately, such a system is only as good as – you guessed it – the ability of users to restrain their own bandwagon tendencies. In my experience, it looks like a lot of rabid Lefties have gotten hold of these sites and automatically vote down anything that isn’t morally libertarian or Bush-hating. Rabid righties will respond in turn on these sites, of course, and it’s only a matter of time before they get their own social bookmarking site (I’ve checked out quite a few. Only one seems to have a rightward tilt so far).

Okay, a couple of issues here. First, I’m doing an informal survey of Digg and Reddit right now. Here’s what I see (excluding the article in question, of course):

  • Digg has articles about: pens, liquid oxygen, evolution, MacBooks, a potato battery, ISPs and BitTorrent traffic, Bush’s iPod having “illegal” mp3s, China’s president visiting Bill Gates, the inventor of wireless email, a world population tree map, McDonald’s recipes, photos of the bay bridge, interactive surfaces, and the ability to “spy on Iran’s nuclear facilities” with Google Earth.
  • Reddit features: Pictures taken by Rachel Papo, an article about mocking religious beliefs by Scott Adams (the Dilbert guy), a world population tree map, A discussion of the “facts that every graduating high-school student should know” piece, German schooling, Apple’s legalese sent to a little girl, movie studios and critics, the GreenPeace founder’s conversion to supporting nuclear power, an article about whether Bush is stupid or whether he is fooling Americans, the Flying Spaghetti Monster, the rewards to a life of crime, Stanislav Petrov, random birthday shooting, Blair’s refusal to back Iran strike, the war on fat, an analysis of high gas prices, the MIT magic switch, Bush-Nazi dealings, “The Ghost of Shinseki”, A gay rights and Christianity cartoon, the world’s most powerful camera, and self-employment and intellectual property.

Granted, this is not a random sample, but it is illustrative. I classified these articles into two categories (“leftist” or not), using two different definitions of leftist. The first definition is more permissive with regard to what counts as leftist. Anything that even mentions an issue that liberals tend to care about gets counted. The second is more strict — an article has to (a) imply something negative about the Bush administration, Republicans, or the US in general, (b) take a clearly liberal position on an issue such as evolution, or (c) be otherwise blatantly liberal to be counted in the “leftist” category. The data are here, along with annotations of my decisions to categorize as leftist or not. Here are the basic results:

  • Using the permissive definition, 28% of the articles could be considered liberal
  • Using the strict definition, only 14% could be considered liberal
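As a quick check on the arithmetic, the two lists above contain 36 headlines (14 from Digg, 22 from Reddit). The sketch below is only illustrative: the counts of 10 and 5 leftist articles are assumptions, back-calculated from the reported percentages rather than read off the annotated data.

```python
# Headline counts come from the two lists above; the leftist counts are
# assumptions back-calculated from the reported percentages.
DIGG_HEADLINES = 14
REDDIT_HEADLINES = 22
ASSUMED_LEFTIST = {"permissive": 10, "strict": 5}


def share(count: int, total: int) -> int:
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


total = DIGG_HEADLINES + REDDIT_HEADLINES
shares = {name: share(n, total) for name, n in ASSUMED_LEFTIST.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% of {total} headlines")
```

With a sample this small, reclassifying a single headline moves either figure by about three percentage points.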

On the other hand, there weren’t any articles that were clearly conservative, unless you count the GreenPeace founder’s conversion to supporting nuclear power, which has been supported more by conservatives than liberals (but I’ll say that the article is more about energy efficiency than nuclear politics).

What’s the point of all this? The author of the article writes that liberals vote down anything that isn’t “morally libertarian or Bush-hating.” Okay, that works, as long as you consider the use of potatoes as batteries as being morally libertarian. And I won’t bother to elucidate the difference between libertarianism and liberalism, especially as it relates to morality. The point is that only about 14% of the articles on Reddit and Digg in my little sample were obviously leftist. It’s hard to argue that liberals have taken these sites over to pursue their secular humanist agenda.

The author goes on:

My concern is with the “voting” on these sites, mainly. The voting is secret, which leads to things like this:

* People voting down anything they don’t like without giving a response as to why they don’t like it. [1]
* People arguing seriously, when they do respond, that a large number of negative votes upon an argument constitutes the defeat of an argument. [2]
* The same sorts of stories making it to the front page – usually they’re bland or things everyone knows already, because everyone agrees that the content is relevant. [3]
* Conspiracy theory posts or openly false posts not getting voted down or declared to be false in the comments getting to the front page. [4]
* Such posts like the former being used as the basis for argumentation or other posts entirely. [5]
(Numbering added)

Let’s look at each of these issues in turn:

  1. True, people don’t give a response as to why they don’t like the articles (although technically on Digg they can differentiate between inaccuracy, duplication, or “okay, this is lame”). But if they did, would you read everyone’s opinion? How many people look at Digg every day? Do they all have time to give an explicit account of why they chose to digg each article or not? We don’t get a chance to give a written essay explaining why we voted for our chosen presidential candidate, so why should we hold entertaining webpages to a higher standard?
  2. Nobody honestly believes that because an article about George Bush’s adequacy as commander-in-chief doesn’t make it to the top of Reddit.com that Bush has been proven incompetent. This argument is complete hyperbole.
  3. This argument has two parts: (i) The same sorts of stories making it to the front page. I totally agree, and I think this is because people don’t spend enough time digging around for good stories. (ii) The stories are bland or irrelevant, and everyone already knows the information that they contain. This contradicts the entire premise of the article. If everyone already knows everything that gets posted, then all of those leftist articles must be true, right?
  4. Conspiracy theory posts not getting voted down, but the comments say that they’re false. First, I have to say that the article to which I’m responding probably qualifies as a conspiracy theory post, so maybe we shouldn’t be so quick to bite the hand that feeds us. Second, people can vote for a conspiracy theory because they find it outrageous and, consequently, funny. See here.
  5. Guilty as charged. I’m using a conspiracy theory article as the basis for a new article. But note the hypocrisy here — in argument #1, the author claims that people don’t explain their reasons for voting an article up or down. In argument #5, he says that new articles arise because of controversy over old articles. Isn’t that a form of the explanation he seeks?

As you can see, there are some flaws in the logic of the article. Why is all of this so disturbing for the author? There is a short explanation:

It is regarding politics where these sites fail miserably, and I’m really worried when Rightist sites like these will emerge. It’s pretty clear to me that the Internet is teaching us new habits of how to conduct ourselves in a democracy, and I don’t think any of the lessons these sites offer are any good.

So, by teaching us to vote Yes/No depending on how we personally feel, internet news sites are threatening democracy? Apparently…

For what the existence of such a site seems to say is that an article is only worth what others “think” (does voting really involve thought?) of it, and that’s a scary thought. If people are allowed to make their basest biases the news with the “justification” that the MSM is no better, how can thoughtful political discussion ever happen? It would seem sites like these – if they get popular – could destroy the possibility of such discussion, even on other sites, for they empower via voting, and voting is a lot more effective than speaking.

First, nobody is saying that an article is only worth what others think. Any addition to discourse is valuable. However, how much the first, say, 500 people that read an article enjoyed it is obviously a pretty good predictor of how much the next 500 people will enjoy it, assuming that there aren’t any systematic ideological biases between readers according to the order in which they access articles. That’s the whole point of these sites. Second, what is the alternative here? Clearly, it’s sites where a select few individuals get to choose what makes it to the front page. The author already admits that this kind of selection lends itself to abuse. Third, and I can’t emphasize this enough, social news sites do not banish articles, they only promote the most popular articles to the front page. Anyone can post any opinion.

At this point in the article, I’m clearly getting the sense that the author can’t really prove that Digg and Reddit are unfair or dangerous, as he claims, but that he just dislikes a lot of the articles, and needs an outlet to express his discontent. And just when I have that thought, I read this:

Voting by itself does not make a democracy. In fact, voting is really a last resort sort of mechanism. It is speech and the ability to persuade and be persuaded that counts most.

which pretty much clears everything up. The author needed an outlet for opposition. That’s fair enough. But note two things here:

  1. The fact that this article actually made it to the front page is fairly strong evidence that speech and persuasion have their place on social news sites.
  2. Democracy: “Government by the people; especially : rule of the majority.” How do you determine who the majority is? Some kind of process by which each individual’s position can be counted. Hmm, there’s got to be a name for that…

The article ends:

What makes the social bookmarking sites dangerous is the reduction of speech to voting, or the constant trumping of speech via voting. That is not the lesson which people should be learning. They should be learning how to persuade and to compromise, to work with each other, not merely exercise strength through the forces governing popularity in a given context.

As I think I’ve shown, this process is already thoroughly implemented on Reddit, Digg and their sister sites. Persuasion takes place through the articles that are posted, and everyone has an opportunity to voice their opinion. Through the presentation of opposing viewpoints on these sites, compromises can be proposed, and a fuller understanding of each topic can arise. It’s called a dialectic. This is not dangerous — this is how all political discourse works. The vote is only the final stage — a technical aspect that is absolutely necessary to actually come to a firm decision. Voting in a presidential election is still “the reduction of speech to voting.” Even though it may be a simplistic way of conveying one’s opinion, what is the alternative? How would we select a president (or which stories should be on the front page) in a democratic way without taking a vote?

In other words, fellow Digg and Reddit readers, keep submitting articles and keep voting for the ones you like. And try to keep the political stuff (like those “leftist McDonald’s recipes”) to a minimum.

Getting away from the aggregator

I finally added a proper blogroll today. I've been avoiding it because I always had a link to my Bloglines feeds and it's always hard to decide which links to put up. If you listed every site you visit regularly, and you're like me, your blogroll would stretch for miles.

But the reason that I did it was that I'm changing the way I use my aggregator. First, I switched from Bloglines to Google Reader. I did this for two reasons: (1) Google Reader seems to have sped up, so either they have more servers or they optimized some of that JavaScript, and (2) I'm reading fewer feeds, and I think Bloglines is better for managing a lot of feeds and Reader is better for managing a select few.

I made this feed list decision because I spend too much time reading feeds, losing productivity by getting distracted. It's so easy to check the aggregator to see if there's something more interesting than the task at hand. By deleting the feed, I'm saying that if I want to know what's going on at, say, Reddit, I have to physically type in the URL instead of just noticing a bunch of new RSS entries. It's still really easy for me to procrastinate, but I have to actively choose to go to a new site, so there's an extra buffer between me and screwing around.

Hence the blogroll. It's a way for me to list the sites that I love, without having them hard-coded into my hourly routine via my aggregator. Right now, the feeds I'm monitoring all meet two conditions: (1) their content is of high personal importance and (2) they are infrequently updated. For example, I really value the posts on Freakonomics, but they don't post very frequently (maybe twice a week), so I monitor the feed. On the other hand, I find lots of great stuff through Reddit and Digg, but those sites are updated continuously, causing way too much RSS traffic.

In other words, I'm reverting in many ways to the "low tech" method of actually visiting websites that I'm interested in, even though I still think RSS is a valuable tool. But don't worry, other sites: even if you don't make it into Reader, you're still a del.icio.us bookmark.

Ten things every Microsoft Word user should know

[Update: This article has been Dugg! If you're reading this, and about to flame about how OpenOffice or LaTeX is better, please read this first. Thanks. -John]

[Update: Since you're obviously in the mood for learning, you may want to give my Excel tutorial a read: Become an Excel ninja.]

[Update: You can download this tutorial as a pdf file.]

Most people use word processors like MS Word as they would a typewriter — manually making section headers bold and centered, inserting hard breaks between paragraphs, etc. This formatting method is fine for short documents, but for long documents that include multiple sections, figures, tables and other elements that need to be styled consistently throughout the text, it pays to learn Word’s advanced features.

These features are easy to use but poorly documented and, in my experience, underused, even by professionals who frequently write long documents. This tutorial presents ten tips to help you start using Word the smart way.

1. Styles

The first five tips introduce and explain the use of styles in Word. Styles are user-defined formats that control the look and feel of paragraphs, characters, tables and lists. By creating styles and assigning them to the elements of your document, you can more easily control your document’s formatting.

In Windows, the Styles pane can be turned on by clicking on the “AA” button on the Formatting toolbar:

StylesWin1.png

On the Mac, the Styles pane is one of the modules on the Formatting palette. On either OS, the pane looks something like this:

Mac1.png

The functions of the different parts of the pane are obvious. Current style of selected text shows either the style currently in use, or a summary of the formatting if the selected text doesn’t use a style. The options under Pick style to apply are either default or user-defined styles. Using the List dropdown, you can change which styles are shown in the Pick style list.

Mac2.png

Clicking on a style name applies that style to the selected text. Clicking on the Style type icon allows you to modify the style. The Modify Style dialog allows you to change every aspect of that style (font, paragraph formatting, and more, depending on the type of style selected).

Mac3.png

For example, the next two images show the same text. The first has indented paragraphs; the second has unindented paragraphs with a 12pt space between them. Rather than creating this formatting manually, the entire page was changed from the first style to the second by editing the Normal style from Paragraph: Indentation: First Line: .5in to no paragraph indentation, but Paragraph: Spacing: After: 12pt. Note that in the second example, while it looks like there is a blank line between the paragraphs, there isn’t: the cursor moves directly from one paragraph to the next, and the gap between them is simply spacing that is the same height as a line of text.

MacIndent.png

 

MacSkip.png

Okay, that introduces styles. Now on to the good stuff.

2. Header styles and the Table of Contents

If you wanted the first-level header for your document to be Times New Roman, 16pt, bold and centered, you could easily create a new style with that formatting. However, Word has 9 built-in heading styles. They may not look the way you want them to out of the box, but you can easily change them to reflect your preferred formatting.

Why use these styles? If you use the built-in header styles, Word will recognize the structure of your document. This means that Word will know that you are typing a header, a subheader, a subsubheader, etc. It also means that, when you’re done typing your document, you can create an automatic table of contents by selecting Insert > Field > Index and Tables > TOC. You can even have Word automatically number your headings by editing the numbering styles from the Modify Styles dialog.
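To see why tagging headers with a style matters, here is a minimal sketch of the idea in Python. It is not how Word works internally; `Paragraph` and `build_toc` are hypothetical names invented for this illustration. The point is that once every header carries a style name, an outline can be generated mechanically instead of typed by hand.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    style: str  # e.g. "Heading 1", "Heading 2", "Normal"
    text: str


def build_toc(paragraphs: list[Paragraph]) -> list[str]:
    """Collect heading-styled paragraphs into an indented, numbered outline."""
    counters: list[int] = []
    lines = []
    for para in paragraphs:
        if not para.style.startswith("Heading "):
            continue  # body text never appears in the table of contents
        level = int(para.style.split()[1])
        # Drop counters deeper than this level, pad if a level was skipped
        counters = counters[:level] + [0] * (level - len(counters))
        counters[level - 1] += 1
        number = ".".join(str(n) for n in counters)
        lines.append("  " * (level - 1) + f"{number} {para.text}")
    return lines


doc = [
    Paragraph("Heading 1", "Introduction"),
    Paragraph("Normal", "Some body text."),
    Paragraph("Heading 2", "Background"),
    Paragraph("Heading 2", "Scope"),
    Paragraph("Heading 1", "Methods"),
]
toc = build_toc(doc)
print("\n".join(toc))
```

A header that was only made bold and centered by hand would look like any other "Normal" paragraph to this function, which is the same reason manually formatted headers never show up in an automatic table of contents.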

3. Table styles

If you produce documents that contain many tables, you’re probably familiar with the tedium of formatting table after table after table. Using table styles can make this process substantially easier.

To create a default table style, modify the Table Grid style. When you edit a table style, you can define separate styles for the header row, even and odd rows, the left and right columns, etc. So if you want the header row to be bold with a double border, and the rest of the table to be normally weighted with no border, you can edit the Table Grid style so that any table inserted into the document will automatically have that form.

MacTable.png

You can make additional modifications to the table after the general Table Grid style has been applied.

4. Character styles

Say, for example, you are creating a document that uses code examples (like HTML). You may want to set any code examples in a monospaced font when referencing them within a sentence. The old-fashioned way to do this is to select Font > Courier New, type your code, then revert to the old font. This can become tedious, and if you wanted to change the code examples to another font, you’d have to manually edit every instance of Courier New in your document.

Instead, you could create a new character style by selecting New Style from the Formatting pane and setting Style type to character. Change the font to Courier and now, anytime you want to refer to a piece of code within a paragraph, you can type the code fragment, select it, and select the style that you just created. Only the selected text will be changed.

MacCharacter.png

The difference between a character style and a paragraph style is that choosing a character style only affects the selected text, while choosing a paragraph style changes all of the text in the paragraph containing the selection to that style.

5. Line and page breaks

Have you ever inserted a table or a graphic into a Word document only to find that a new page starts right after the heading but before the table or graphic? The natural thing to do in this situation is to insert an extra carriage return or a page break right before the heading to ensure that both the heading and the table/graphic appear on the same page, right? But what happens when, later on, you’re making changes and add a new paragraph before the table heading? Now there is too much space between the table header and the paragraph that precedes it. And before you can finalize your document, you have to visually inspect and manually edit every page of your document to make sure that there are no other instances of the same problem.

There is an easy, styles-based solution to this problem. First, define a new style for table/graphic headers (or use the automatic style, described in part 6). Then, in the Modify Style dialog, select Format > Paragraph and click on the Line and Page Breaks tab. Check the box marked keep with next. Now, if Word encounters a situation where the paragraph/table/image/etc. after the header is going to start on a new page, it will make sure that the header is also on the new page. This also works for section headers.

Also, in the Line and Page Breaks tab, note the Page break before option. Selecting this option will ensure that elements of the current style will begin a new page. This is useful if, for example, you want new Heading 1 elements to always start a new page.

6. Captions and cross references

Once again, imagine that you are creating a document with many tables. The low-tech way to number your tables is to hard-code the table numbers into their headers (“Table 1: Blah…”) and refer to tables using those manually assigned numbers (“Table 1 shows that…”). But what happens if, halfway through your document, you realize that you need a new table between Table 14 and Table 15? Now you have to renumber every table from Table 15 onward. This could take a long time if you have, say, 50 tables, and the likelihood that you’ll miss an in-text reference is fairly high.

Word has an automatic table/figure numbering feature, however. Instead of manually creating a table header with a number, you could select the entire table, right click, and choose Caption. This will open a dialog that allows you to automatically insert a “Label-Number” caption, with control over the numbering format (“Table 1.3.4,” or “Figure 5”). The caption will automatically have Word’s built-in Caption style, which you can edit using the standard Modify Style dialog from the Styles pane. You can also choose whether the caption will be placed above or below the table or figure.

But what about those in-text references to tables and figures? Easy. Choose Insert > Cross reference and select the table or figure that you want to refer to. Word will automatically insert the text “Table X” (or “Figure X”) where X is the automatically generated number of the table or figure. You can even have Word insert the entire caption, if desired. The cross reference feature also works for numbered items, footnotes, endnotes, etc., and can save a lot of time for writers creating long documents.
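The reason this survives insertions is that a cross reference points at the table itself, not at a number. The toy model below sketches that idea in Python. `CaptionRegistry` and its methods are hypothetical names made up for this example, and it makes no claim about Word's actual internals.

```python
from typing import Optional


class CaptionRegistry:
    """Toy model: each caption has a name, and numbers are computed on demand."""

    def __init__(self) -> None:
        self._names: list[str] = []  # captions in document order

    def add_caption(self, name: str, position: Optional[int] = None) -> None:
        """Register a table caption; position=None appends it at the end."""
        if position is None:
            self._names.append(name)
        else:
            self._names.insert(position, name)

    def reference(self, name: str) -> str:
        """Resolve a cross reference to its current 'Table N' text."""
        return f"Table {self._names.index(name) + 1}"


tables = CaptionRegistry()
tables.add_caption("revenue")
tables.add_caption("costs")
before = tables.reference("costs")

# Insert a new table between the two existing ones
tables.add_caption("headcount", position=1)
after = tables.reference("costs")

print(before, "->", after)
```

Nothing in the text that mentions the "costs" table had to be edited; its number changed because it is looked up each time rather than typed in.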

Tip: Sometimes the fields that display the automatic references become incorrect when new tables, figures, footnotes, etc. are inserted. Although the document will print correctly, the on-screen display will be off. To force Word to update all of these cross-reference fields, click on Print Preview, then close preview to view your document.

7. Turn off auto formatting

A common complaint about Word is its tendency to automatically create numbered and bulleted lists. Sometimes users want to manually make a numbered list, or begin a sentence with the “-” character without beginning a new bulleted list that uses that character. Such formatting is very easy to turn off. Select Tools > AutoCorrect > AutoFormat to turn off/on automatic lists, smart quotes, character-based formatting, fractions, etc.

8. Character-based formatting

But before you go rushing to the Tools menu to turn off all of the automatic formatting, consider taking a moment to learn how such formatting works. If you understand it, it could save you a lot of time. For example, by typing text surrounded by asterisks (*like this*), you can have Word make your text bold. Or, by surrounding text with the underscore character, you can have Word make the enclosed text italic. When you don’t have to take your fingers off the keyboard to apply formatting changes, you can work quite a bit faster.
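The same convention is easy to picture as a pair of text substitutions. The Python sketch below is only an approximation: the exact rules Word applies are not described here, so the two patterns are assumptions, and the `<b>` and `<i>` tags simply stand in for bold and italic.

```python
import re

# Assumed rule: a marker counts only when it hugs the text it wraps and is
# not glued to a neighbouring word character (so "2 * 3 * 4" is left alone).
BOLD = re.compile(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)")
ITALIC = re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)")


def autoformat(text: str) -> str:
    """Replace *bold* and _italic_ markers with stand-in tags."""
    text = BOLD.sub(r"<b>\1</b>", text)
    return ITALIC.sub(r"<i>\1</i>", text)


print(autoformat("make *this* bold and _that_ italic"))
```

Requiring the markers to sit directly against the wrapped text is what keeps ordinary asterisks and underscores, such as those in arithmetic or file names, from being treated as formatting.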

9. Continue previous list

The main reason why people want to turn off automatic numbered lists is to insert indented or other text between list items:

MacList.png

This can be difficult using the automatic numbered (or bulleted) list feature because Word’s default behavior is to start item number 9 after you hit return, or to begin item 8a after you hit tab. However, by creating a list using the automatic numbering function, turning the list off after the desired list item, inserting whatever needs to go before the next item, then inserting a new list, and choosing Format > Bullets and numbering:

MacContinue.png

and selecting Continue previous list, you can force the list to start at the number of the item that you left off, +1.

10. Keyboard shortcuts

Finally, a few keyboard shortcuts that might save you some time:

  • Cmd+T: Hanging indent (hit Cmd+T again to increase the hanging indent; hit Cmd+Shift+T to decrease/remove the hanging indent)
  • Cmd+=: Subscript (hit Cmd+= again to revert to the normal font)
  • Cmd+Shift+=: Superscript
  • Cmd+Shift+L: Start a bulleted list (this can also be accomplished by starting the sentence with an asterisk, assuming that this hasn’t been turned off using the AutoCorrect menu)

Note: Replace Cmd with Ctrl for Windows.

Notes

  • Most of these screenshots show Word running on Mac OS X, but the interfaces are very similar for recent versions of Word for Windows.
  • I know, this is long, but it had to be written. I’m tired of having to collaborate with other authors that are still in the word processing Stone Age. These tips take about 5 minutes to learn, and they’ll save hours over the years.
  • Word is far from the only system that offers these features. Similar functionality can be found in most word processing systems, including OpenOffice.org. If you really like logical markup and automatic formatting, consider learning LaTeX. I chose to write about Word because (a) Word is the most common word-processing application and (b) it’s actually a pretty polished product.

Web services toolkit

Blogging: WordPress.com, of course. WP.com gives categories, pages, stats, easy template changes through widgets, and new features keep coming. Blogger is a close second, but it needs to be updated to reflect the evolution of blogs beyond personal diaries.

Email: Gmail. No competition.

Calendar: Google Calendar. It offers multiple calendars, subscriptions, notifications, drag and drop reorganization, one-click event adding and Gmail integration. Kiko has many of these features, but Google Calendar has better network effects – more users, integration with other services, etc.

RSS aggregator: Tie – Bloglines and Google Reader. Bloglines is really easy to use, it's fast, it offers drag-and-drop feed organization, and has special features like email subscriptions, package tracking, etc. Google Reader, while a bit slower and lacking some of Bloglines' features, has a great AJAX interface that allows you to use the keyboard to browse feeds, add labels, etc.
