http://blog.csdn.net/eroswang/archive/2008/12/19/3560366.aspx

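While reading Volume 1 of UNIX Network Programming (2nd edition), I noticed that Section 27.6, "TCP Preforked Server, No Locking Around accept", in Chapter 27, "Client-Server Design Alternatives", describes a design in which the child processes compete for client connections. In pseudocode:

Server parent process: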

listen_fd = socket(...);
bind(listen_fd, ...);
listen(listen_fd, ...);
pre_fork_children(...);
close(listen_fd);
wait_children_die(...);

Server child process:

while (1) {
        conn_fd = accept(listen_fd, ...);
        do_service(conn_fd, ...);
}
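The pseudocode above can be tried out with a small Python 3 sketch. This is only an illustration under assumptions, not the book's C code: the function name serve_prefork, the loopback address, the built-in test clients and the forced SIGKILL shutdown are all made up for the demo, and os.fork requires a POSIX system.

```python
import os
import signal
import socket


def serve_prefork(num_children=3, num_clients=6):
    """Fork children that all block in accept() on one inherited listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: the kernel picks a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    children = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: ask the kernel for connections directly, forever.
            try:
                while True:
                    conn, _addr = listener.accept()
                    with conn:
                        # do_service: reply with the pid of the serving child
                        conn.sendall(str(os.getpid()).encode("ascii"))
            finally:
                os._exit(0)           # never fall back into the parent's code
        children.append(pid)

    listener.close()                  # the parent itself never accepts

    served_by = []
    try:
        # Stand-in for real clients so the sketch can run on its own.
        for _ in range(num_clients):
            with socket.create_connection(("127.0.0.1", port)) as client, \
                    client.makefile("rb") as reply:
                served_by.append(int(reply.read()))
    finally:
        for pid in children:          # wait_children_die, forced for the demo
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
    return children, served_by
```

Calling serve_prefork() returns the child pids and, for each test connection, the pid of the child that served it. Which child wins each connection is up to the kernel.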

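When I first saw this code, it was a real eye-opener. As the author says, code like this is rarely seen (I certainly had not seen it before reading the book). The design is ingenious: it sidesteps the usual "trap" of preforked multi-process servers, where the parent accepts each new connection and then has to find a way to hand it to a child. Here the children share the listening socket and each one asks the kernel for a connected socket itself, which avoids the "dirty trick" of passing file descriptors over a UNIX domain socket.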

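Reading on, though, the author brings up the "thundering herd" problem: when many processes are blocked in the accept system call, the kernel wakes all of them even if only one new connection arrives. Only one of them gets the connection and the rest go back to sleep, so the wasted wakeups "shake" the system and degrade performance.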

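Besides this problem, accept must also be an atomic operation. For that reason, Section 27.7 presents a version with a mutual-exclusion lock around accept: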

while (1) {
        lock(...);
        conn_fd = accept(listen_fd, ...);
        unlock(...);
        do_service(conn_fd, ...);
}
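A runnable Python 3 sketch of the locked loop follows. It assumes a POSIX system and uses fcntl.lockf on a temporary file as the lock; that choice, the function name serve_prefork_locked, and the built-in test clients are assumptions made for the demo rather than the book's implementation.

```python
import fcntl
import os
import signal
import socket
import tempfile


def serve_prefork_locked(num_children=3, num_clients=6):
    """Like the unlocked design, but only the lock holder may sit in accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    port = listener.getsockname()[1]

    lock_file = tempfile.TemporaryFile()      # opened before fork, shared by all
    lock_fd = lock_file.fileno()

    children = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            try:
                while True:
                    fcntl.lockf(lock_fd, fcntl.LOCK_EX)      # lock(...)
                    try:
                        conn, _addr = listener.accept()
                    finally:
                        fcntl.lockf(lock_fd, fcntl.LOCK_UN)  # unlock(...)
                    with conn:
                        conn.sendall(str(os.getpid()).encode("ascii"))
            finally:
                os._exit(0)
        children.append(pid)

    listener.close()

    served_by = []
    try:
        for _ in range(num_clients):
            with socket.create_connection(("127.0.0.1", port)) as client, \
                    client.makefile("rb") as reply:
                served_by.append(int(reply.read()))
    finally:
        for pid in children:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
        lock_file.close()
    return children, served_by
```

fcntl.lockf takes per-process record locks, so the children contend with each other even though they inherited the same descriptor. The lock is released before the connection is serviced, which keeps the critical section as short as the pseudocode intends.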

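That takes care of atomicity, but what about the thundering herd? The text only mentions that on Solaris the CPU time rises noticeably when the number of children goes from 75 to 90, and the author attributes this to swapping caused by too many processes. The answer to the thundering herd question is left quite vague. Comparing columns 4 and 7 of Figure 27.2 in the book, we can be sure that the real culprit is not swapping.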

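So who is the real culprit?

Look closely: does locking really help with the thundering herd? True, only one child calls accept at any moment and the others are blocked on the lock statement. But what happens once accept returns and the lock is released? unlock has to wake processes blocked on the lock, and nothing specifies whether it wakes one or many. So the potential thundering herd is still there; it has merely moved to a different place and taken a different form. The sharp performance drop on Solaris is very likely caused by exactly this.

That is alarming! Does this mean that any lock can cause a thundering herd?

It seems so, which is why it matters to use fewer locks, especially where contention is heavy.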

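The "descriptor passing" server that the author implements in Section 27.9 overcomes the thundering herd problem effectively, and the dispatch-based approach described there is also the one most commonly used in real servers.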
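To make the mechanism concrete, here is a minimal Python sketch of passing a descriptor over a UNIX domain socket. It assumes Python 3.9 or later on a Unix system (socket.send_fds and socket.recv_fds). For brevity both ends live in one process and a pipe stands in for an accepted connection; the name pass_descriptor_demo is invented for the example, and this is not the server from Section 27.9.

```python
import os
import socket


def pass_descriptor_demo():
    """Send one end of a pipe across a UNIX domain socket pair and read through it."""
    dispatcher_end, worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()     # stands in for an accepted connection
    try:
        # Dispatcher side: the descriptor travels as ancillary data (SCM_RIGHTS).
        socket.send_fds(dispatcher_end, [b"job"], [read_fd])

        # Worker side: the kernel installs a new descriptor for the same open file.
        msg, fds, _flags, _addr = socket.recv_fds(worker_end, 16, 1)
        received_fd = fds[0]

        os.write(write_fd, b"hello")
        data = os.read(received_fd, 5)
        os.close(received_fd)
        return msg, data, received_fd != read_fd
    finally:
        os.close(read_fd)
        os.close(write_fd)
        dispatcher_end.close()
        worker_end.close()
```

In a real dispatching server the two socket ends would belong to the parent and one child, and the parent would choose which child receives each accepted connection.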

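Replacing "competition" with "dispatch" is an effective way to avoid the thundering herd, but do not neglect load balancing in the dispatcher, or the consequences may be even worse!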

def f(L = [0]):
    L[0] += 1
    return L[0]

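The default value here cannot be an immutable value such as 0: the default object is created once, when the function is defined, and only a mutable object such as a list can keep state between calls.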
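A short comparison makes the difference visible. The function names counter and not_a_counter are made up for this sketch.

```python
def counter(state=[0]):
    # The list is created once at definition time and shared by every call.
    state[0] += 1
    return state[0]


def not_a_counter(n=0):
    # n += 1 only rebinds the local name; the default 0 itself never changes.
    n += 1
    return n


calls = [counter(), counter(), counter()]       # [1, 2, 3]
plain = [not_a_counter(), not_a_counter()]      # [1, 1]
```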

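import os
import amodule  # placeholder name for any importable module
path = os.path.dirname(amodule.__file__)  # directory containing that module
path = os.path.dirname(__file__)          # directory containing the current file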

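search = request.POST['search'] returns a unicode string (u'...'); it is best to convert it to UTF-8 bytes: search = request.POST['search'].encode('utf-8')
For a str, len(x) is the length in bytes; for a unicode string it is the length in characters.
AES-CBC encryption needs the byte count so the data can be padded to a multiple of 16 bytes; a unicode string reports its character count instead, which gives the wrong padding.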
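The padding step can be sketched as follows in Python 3, where str plays the role of Python 2's unicode and bytes the role of str. PKCS#7 is assumed as the padding scheme, since the note does not say which one was used, and pkcs7_pad is a helper written for this example rather than a PyCrypto function.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Pad to a multiple of block_size; every pad byte holds the pad length."""
    pad_len = block_size - len(data) % block_size
    return data + bytes([pad_len]) * pad_len


text = "多少看风景"
raw = text.encode("utf-8")      # encode first, then measure and pad

padded = pkcs7_pad(raw)         # 15 bytes of text + 1 byte of padding
full_block = pkcs7_pad(b"a" * 16)   # already aligned: a whole extra block is added
```

Padding computed from len(text), which is 5, would have added 11 bytes to 15 bytes of data and produced 26 bytes, not a multiple of 16.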

http://lobstertech.com/2009/jun/07/python_unicode_tutorial/

First off, it’s important to understand the difference between the two types of strings Python currently uses (this distinction will be going away in Python 3.0, where the current unicode type will become the only string type in Python): bytestrings, which are of type str, and Unicode strings which are of type unicode. Both inherit from a common parent, basestring, which gives you an easy way to identify things that are strings, regardless of which type of string they are.

Python bytestrings, as the name implies, correspond to a series of bytes which, in some particular encoding, represent a sequence of characters. The default most people will get is ASCII, an encoding which handles exactly 128 (English) characters, but you can create Python bytestrings in all sorts of encodings: ISO-8859-1 for a broader set of Western European characters, KOI-8 for Russian, GB-2312 for Chinese, etc., and Python bytestrings can be encoded or decoded between compatible encodings with some ease.

Unicode strings, on the other hand, correspond to sequences of Unicode characters, and, contrary to popular belief, Unicode is not an encoding; sequences of Unicode characters, by themselves, don’t translate into any particular set of bytes you could send over a network connection, for example. In order to pass Unicode text out of your programs (and, potentially, into other programs or services), you need to encode it into a series of bytes, using a “Unicode transformation format” such as UTF-8. One of the best introductions to this sort of thing for programmers is Joel Spolsky’s “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”, which I highly recommend you read if you’re not that familiar with how Unicode works in programming languages.

Internally, Django works with Unicode strings; data coming in from HTTP requests, databases or files will be converted to unicode before you ever see it, and data going out will be converted to an appropriate encoding, so that you largely don’t have to worry about character-encoding issues. But this is something you need to be aware of and, since your application may be doing other types of input or output or working with other software, you’ll need to know how to work around things that absolutely need one particular type of string.

unicode strings (u'...') and byte strings (str)

>>> import sys
>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('utf-8')
>>> x = '多少看风景'
>>> x
'\xe5\xa4\x9a\xe5\xb0\x91\xe7\x9c\x8b\xe9\xa3\x8e\xe6\x99\xaf'
>>> len(x)
15
>>> len(x.encode('utf-8'))
15
>>> x = u'多少看风景'
>>> len(x)
5
>>> len(x.encode('utf-8'))
15

A str is already in some encoding; a unicode string has no encoding and can be encoded with encode().
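A small Python 3 sketch of the same point: one piece of text has no byte length of its own, only a length per encoding. The variable names are illustrative; gb2312 is used as a second encoding because it is one of Python's standard codecs.

```python
text = "多少看风景"                     # text: 5 characters, no encoding attached

as_utf8 = text.encode("utf-8")        # 3 bytes per character here
as_gb2312 = text.encode("gb2312")     # 2 bytes per character here

lengths = (len(text), len(as_utf8), len(as_gb2312))

# Decoding with the matching codec recovers the same text from either form.
same_text = as_utf8.decode("utf-8") == as_gb2312.decode("gb2312") == text
```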

>>> f = open('/tmp/lawl', "wb")
>>> f.write(u'\u5b57')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\u5b57' in position 0: ordinal not in range(128)
>>> f.close()

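When a unicode string is passed to a function that needs bytes, it is converted with the default encoding (sys.getdefaultencoding()), and that conversion can fail. smart_str probably exists to encode a unicode string as UTF-8.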
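One way to avoid depending on the default codec is to encode explicitly before writing. The sketch below is Python 3; write_text is a hypothetical helper (it is not Django's smart_str), and the temporary directory is only there so the example can run anywhere.

```python
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Encode explicitly so the bytes on disk do not depend on a default codec."""
    with open(path, "wb") as f:
        f.write(text.encode(encoding))


# The failure from the session above, reproduced on purpose with the ascii codec.
try:
    "\u5b57".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

path = os.path.join(tempfile.mkdtemp(), "lawl")
write_text(path, "\u5b57")
with open(path, "rb") as f:
    on_disk = f.read()
```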

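Installing PyCrypto: I first downloaded the source and built it myself in the PyCrypto directory, then later also installed a copy with easy_install. That package is not unpacked; it is a single .egg file, which unzip can extract.
Later, when I modified PyCrypto and rebuilt it, my changes never seemed to take effect. It turned out that the .egg file was found first on the import path; after deleting it, the changes showed up.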

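In urls.py, the pattern (r'^media/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.STATIC_PATH}) did not work; changing media to my_media fixed it. I have not tracked down the reason; a likely cause is a clash with Django's admin media prefix, which defaulted to /media/ in versions of that era.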

b = []
b += "123"       # ['1', '2', '3']: += extends the list with each character
b.append("123")  # ['1', '2', '3', '123']: append adds the whole string as one item
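The same behavior spelled out step by step, as a small sketch. The extra lists c and d are added only for comparison.

```python
b = []
b += "123"                 # a string is iterable, so each character is added
after_iadd = list(b)

b.append("123")            # the string goes in as a single element
after_append = list(b)

c = []
c.extend("123")            # extend is what += does for lists

d = []
d += ["123"]               # wrap the string in a list to add it whole with +=
```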
s.encode('hex')  # Python 2 only: hex-encodes the byte string s, e.g. 'abc' -> '616263'
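>>> i for i in range(16)        # Wrong: a bare generator expression is a syntax error
>>> (i for i in range(16))      # Right
>>> f(i for i in range(16))     # Right: the parentheses may be omitted for a sole argument
>>> f((i for i in range(16)))   # ...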
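These rules can be checked without crashing the interpreter by compiling each form as an expression. The helper is_valid_expression is written for this sketch, and sum stands in for the unnamed f.

```python
def is_valid_expression(source):
    """Return True if source compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


bare = is_valid_expression("i for i in range(16)")
parenthesized = is_valid_expression("(i for i in range(16))")
sole_argument = is_valid_expression("sum(i for i in range(16))")
double_parens = is_valid_expression("sum((i for i in range(16)))")

# With a second argument, the generator expression needs its own parentheses.
unwrapped_with_start = is_valid_expression("sum(i for i in range(16), 0)")
wrapped_with_start = is_valid_expression("sum((i for i in range(16)), 0)")

total = sum(i for i in range(16))
```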
