Formatting and pronunciation issues are a major stumbling block for these interfaces, though. If the device (or service behind it) can't identify "ten to the power of" when it sees 10^, I can't rely on being able to converse with my device for significant problem sets.
Solving it isn't easy, either. When reading a document, is a caret a power, a regular expression operator, an exclusive or, or a nose on a smiley? :^)
A lot of context identification work has to be sorted out for smart agents to succeed.
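To make the caret ambiguity concrete, here is a rough Python sketch of the kind of context check an agent would need before reading "10^" aloud. Everything in it is an assumption made for illustration: the function name `classify_caret`, the character sets, and the rules themselves (tight spacing suggests a power, spaces on both sides suggest exclusive or). It only looks at the characters on either side of the caret, which is far less context than a real agent would need.

```python
# Hypothetical neighbor-based heuristic; the sets and rules are illustrative assumptions.
SMILEY_EYES = {":", ";", "="}
SMILEY_MOUTHS = {")", "(", "D", "P"}
# An empty string here stands for "caret is the first character of the text".
REGEX_OPENERS = {"", "[", "(", "/", "|"}


def classify_caret(text: str, index: int) -> str:
    """Guess what the caret at text[index] means from its immediate neighbors."""
    if not 0 <= index < len(text) or text[index] != "^":
        raise ValueError("no caret at that index")

    before = text[index - 1] if index > 0 else ""
    after = text[index + 1] if index + 1 < len(text) else ""

    # Eyes on the left and a mouth on the right, as in :^)
    if before in SMILEY_EYES and after in SMILEY_MOUTHS:
        return "smiley nose"
    # Start of a pattern or of a character class, as in ^abc or [^a-z]
    if before in REGEX_OPENERS:
        return "regex operator"
    # Spaces on both sides, as in a ^ b (assumed to be code-style XOR)
    if before == " " and after == " ":
        return "exclusive or"
    # Written tight against an operand, as in 10^3 or a trailing 10^
    if before.isalnum() or before == ")":
        return "power"
    return "unknown"


if __name__ == "__main__":
    for sample, pos in [("10^3", 2), ("[^a-z]", 1), ("a ^ b", 2), (":^)", 1)]:
        print(sample, "->", classify_caret(sample, pos))
```

Even this toy version gets things wrong quickly: "x^y" with no spaces could be a power in a math text or an XOR in C code, and only the surrounding document can settle it.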