Post a Comment On: Only Python

"Thwarted by lack of speed"

5 Comments

Blogger Michael Watkins said...

I use reStructuredText for a content management system and have found it useful to cache the output from docutils. I keep the reST source in my datastore, write out the post-processed result as XHTML (which is also run through tidy), and save it in a file system cache.

Only the first hit therefore goes through docutils. In my experience that first hit isn't terribly slow even for moderately large documents, but I'm using my own servers, not Google's.

Still, even on old hardware the difference is 7 to 10 requests per second when rendering the document fresh every time versus 200 requests per second when serving the cached HTML fragment.

The nice thing about the solution is I can delete the cache directory and it all gets rebuilt as the documents are accessed.

Perhaps pre-rendering and caching can work for part of your need.
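
A minimal sketch of the render-once, serve-from-cache idea described above. The names `cached_html` and `render_rest`, the cache file naming, and the choice to key entries on a hash of the source are assumptions for illustration, not the setup actually used; `doc_id` is assumed to be safe to use in a file name, and the tidy step is left out.

```python
import hashlib
import tempfile
from pathlib import Path


def render_rest(source: str) -> str:
    """Render reST to an HTML fragment with docutils (imported lazily)."""
    from docutils.core import publish_parts
    return publish_parts(source, writer_name="html")["fragment"]


def cached_html(doc_id: str, source: str, cache_dir: Path, render=render_rest) -> str:
    """Return rendered HTML, calling the renderer only on a cache miss."""
    # Assumption: key on a hash of the source so an edited document
    # gets a fresh cache entry instead of serving stale HTML.
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
    path = cache_dir / f"{doc_id}-{digest}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = render(source)
    # The directory is recreated on demand, so deleting it only
    # forces documents to be re-rendered as they are accessed.
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return html
```

Calling `cached_html("about", source, Path(tempfile.gettempdir()) / "rest-cache")` twice with the same source should reach docutils only on the first call.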

1:03 AM

Blogger André Roberge said...

Michael:

Thanks for your suggestion. I won't be able to use it directly on one of Google's servers, as I doubt it would get around the time limit when first building the cache. However, I might try to use it locally, perhaps creating static files (instead of using a cache) and uploading them to the server.

7:58 AM

Blogger Tony Arkles said...

André,

We use App Engine for a project at work, and I don't think you should discard the idea yet!

We've found that retrieving a URL can be a slow process -- pretty much independent of the size (for reasonably-sized HTML). Based on your description, this won't be happening too often: most of your content will be served to "regular users".

If you're concerned that your urlfetch and then processing is going to take too long for a single request, you can split the task up into two requests. The first would retrieve the raw HTML into the datastore, and the second would do the processing.
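
The two-request split could look roughly like this. It is only a sketch: the plain dictionaries stand in for the datastore, the `fetch` and `process` callables stand in for urlfetch and the reST processing, and all names here are hypothetical rather than App Engine APIs.

```python
from typing import Callable, Dict

# Stand-ins for datastore entities (assumption: plain dicts keyed by URL).
raw_store: Dict[str, str] = {}
processed_store: Dict[str, str] = {}


def handle_fetch(url: str, fetch: Callable[[str], str]) -> None:
    """Request 1: retrieve the raw HTML and store it, with no processing."""
    raw_store[url] = fetch(url)


def handle_process(url: str, process: Callable[[str], str]) -> bool:
    """Request 2: process HTML stored by request 1.

    Returns False if nothing has been fetched for this URL yet.
    """
    raw = raw_store.get(url)
    if raw is None:
        return False
    processed_store[url] = process(raw)
    return True
```

Each handler does only one of the two slow steps, so neither request has to pay for both the fetch and the processing.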

The time limitation is there for you to identify which tasks are heavy CPU users, so that you can optimize them. If these tasks happen infrequently (compared to the total traffic on your site), you should be fine.

9:48 AM

Blogger Joseph said...

I am new to Google App Engine, but could you cache the result of the reST processing in Google's database? The impression I got about App Engine was that your best bet is to pre-calculate and cache everything.

4:25 PM

Anonymous Anonymous said...

I would suggest a few possibilities:

1. Process and cache when the text is submitted, rather than when it is retrieved.

2. Do the additional processing and URL fetching in the browser using Ajax (I recommend the jQuery library, but whatever you prefer really) instead of in App Engine.

3. If you use something simpler than reStructuredText, such as Markdown, you can do the preprocessing in the browser as well, using Showdown: http://attacklab.net/showdown/
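
Option 1 above, processing when the text is submitted rather than when it is retrieved, might be sketched as follows. The `Post` record, the in-memory `posts` dictionary, and the `submit` and `view` functions are hypothetical stand-ins for a stored entity and its request handlers; the renderer is passed in so any markup processor could be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Post:
    source: str  # original markup, kept so the text can be edited later
    html: str    # rendered once, at submit time


# Stand-in for persistent storage (assumption: a dict keyed by post id).
posts: Dict[str, Post] = {}


def submit(post_id: str, source: str, render: Callable[[str], str]) -> None:
    """Pay the rendering cost on the (rare) write path."""
    posts[post_id] = Post(source=source, html=render(source))


def view(post_id: str) -> str:
    """The (frequent) read path returns stored HTML without rendering."""
    return posts[post_id].html
```

With this layout the renderer runs once per edit, however many times the page is viewed.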

5:31 PM
