diff --git a/a.html b/a.html
new file mode 100644
index 0000000..240cdc3
--- /dev/null
+++ b/a.html
@@ -0,0 +1,1750 @@
+China Youth Daily: Why Is the Harbin Water-Price Hearing So Cloak-and-Dagger? (财经, 凤凰网)
+China Youth Daily: Why Is the Harbin Water-Price Hearing So Cloak-and-Dagger?
+10:03, December 11, 2009    Source: China News Service (中国新闻网)
+Harbin's water-price hearing, long in gestation and repeatedly delayed, finally convened 23 days behind schedule. Yet a China Youth Daily reporter found that among the 13 consumer representatives, supposedly seated "without any screening," at least four have identities in doubt. One "laid-off worker" representative had been swapped for a neighborhood-committee official; another "laid-off worker" is in fact a retired cadre of the petitions bureau; the "retired worker" is actually the board chairman of a hotel; and a "lawyer" who kept repeating that "the law requires water prices to be adjusted every three years" appears nowhere on the lawyers' registry website. (China Youth Daily, December 10)
+How can a panel of a mere 13 hearing representatives end up with identities so riddled with doubt, so positively cloak-and-dagger?
+According to the Harbin Price Bureau, the consumer representatives at this hearing were all recommended by the Consumers' Association; the bureau did no screening and knows nothing of the association's selection procedure. The Harbin Consumers' Association, for its part, says the 13 consumers were recruited openly from society at large, so that, on paper at least, the representatives were willing, capable, and representative. Yet even this "rigorous" procedure was successfully subverted, its ranks salted with ringers, undermining the hearing's credibility.
+Why did the Consumers' Association not seriously verify the representatives' identities? Can an organization that has always billed itself as the consumers' voice really not tell a laid-off worker from a retired cadre? Or did the "retired worker" only become a hotel chairman after retiring? The facts say otherwise: a laid-off worker, even re-employed, cannot become a retired cadre; the hotel chairman has not even reached retirement age; and Harbin surely does not lack a single genuine lawyer willing to attend a water-price hearing. In my view, the association's blurred allegiance is the real root of the Rashomon surrounding its recommended representatives. The association ought to speak for consumers at large, but in recent years we have watched it grow ever more official in coloration and ever more distant from ordinary consumers; when recommending consumer representatives, it tends to take the hint and act on the government's wishes of its own accord.
+The Price Bureau, as convener of the hearing, also bears responsibility for verifying the representatives' identities, rather than deftly shifting the blame onto the Consumers' Association. In truth, vetting 13 representatives is not that complicated: a willing check by the relevant departments, or simply publishing the list for public scrutiny, would have made everything clear.
+Evidently, the representatives' identities were not unknowable; the departments concerned, the water utility included, simply did not want to know. The cloak-and-dagger air is mere surface. The root reason is that, well before the hearing, the government departments and the water company had likely already reached a consensus on raising prices. The urge to raise prices is rigid and irrepressible; the hearing exists only to drape a lawful cloak of public opinion over a price hike for a monopolistic public good. The bait-and-switch played with the hearing representatives was simply a scheme to keep the hearing under control.
+Chaotic as hearings have become, I still believe the institution should be kept, and perfected, regulated, and constrained. Take the Harbin hearing: for all the pains the Price Bureau and the Consumers' Association took to swap out representatives who might strike a discordant note, genuine public opinion still found its voice. At the hearing, Liu Tianxiao, a retired teacher and the only representative firmly opposed to the increase, was denied the floor for so long that he finally threw a bottle of mineral water at the moderator in protest. We certainly do not encourage such excess, but the detail shows how hard it is for a voice against price hikes to be heard at all. (Hu Yinbin)
+
+(Source: China Youth Daily)
diff --git a/cream.data b/cream.data
new file mode 100644
index 0000000..099faa4
--- /dev/null
+++ b/cream.data
@@ -0,0 +1,41 @@
+20 4 1
+270 0.97 0.0 0.0
+1
+353 0.98 0.0 0.98
+1
+100 0.92 0 0.6
+1
+426 0.98 0.98 0.97
+1
+292 0.97 0.98 0.98
+1
+276 0.97 0.98 0.0
+1
+154 0.95 0.97 0
+1
+114 0.94 0.98 0.96
+1
+0 0 0 0
+-1
+79 0 0 0
+-1
+10 0 0 0
+-1
+0 0 0.92 0.93
+-1
+0 0 0.31 0.07
+-1
+12 0.17 0 0
+-1
+32 0.42 0 0
+-1
+118 0.81 0 0
+-1
+123 0.82 0 0.31
+-1
+18 0.55 0 0.73
+-1
+34 0.33 0.23 0.82
+-1
+200 0.89 0.06 0
+-1
diff --git a/cream.net b/cream.net
new file mode 100644
index 0000000..09c0f11
--- /dev/null
+++ b/cream.net
@@ -0,0 +1,34 @@
+FANN_FLO_2.1
+num_layers=2
+learning_rate=0.700000
+connection_rate=1.000000
+network_type=0
+learning_momentum=0.000000
+training_algorithm=2
+train_error_function=1
+train_stop_function=0
+cascade_output_change_fraction=0.010000
+quickprop_decay=-0.000100
+quickprop_mu=1.750000
+rprop_increase_factor=1.200000
+rprop_decrease_factor=0.500000
+rprop_delta_min=0.000000
+rprop_delta_max=50.000000
+rprop_delta_zero=0.100000
+cascade_output_stagnation_epochs=12
+cascade_candidate_change_fraction=0.010000
+cascade_candidate_stagnation_epochs=12
+cascade_max_out_epochs=150
+cascade_max_cand_epochs=150
+cascade_num_candidate_groups=2
+bit_fail_limit=3.49999999999999977796e-01
+cascade_candidate_limit=1.00000000000000000000e+03
+cascade_weight_multiplier=4.00000000000000022204e-01
+cascade_activation_functions_count=10
+cascade_activation_functions=3 5 7 8 10 11 14 15 16 17
+cascade_activation_steepnesses_count=4
+cascade_activation_steepnesses=2.50000000000000000000e-01 5.00000000000000000000e-01 7.50000000000000000000e-01 1.00000000000000000000e+00
+layer_sizes=5 2
+scale_included=0
+neurons (num_inputs, activation_function, activation_steepness)=(0, 0, 0.00000000000000000000e+00) (0, 0, 0.00000000000000000000e+00) (0, 0, 0.00000000000000000000e+00) (0, 0, 0.00000000000000000000e+00) (0, 0, 0.00000000000000000000e+00) (5, 6, 5.00000000000000000000e-01) (0, 6, 0.00000000000000000000e+00)
+connections (connected_to_neuron, weight)=(0, 1.16192409060151324862e-01) (1, 2.55974968703720300311e+01) (2, 1.46277055544460896641e+01) (3, 3.51973527701262511869e+01) (4, -5.13725333978066203144e+01)
diff --git a/creamer.py b/creamer.py
new file mode 100644
index 0000000..051a9fa
--- /dev/null
+++ b/creamer.py
@@ -0,0 +1,14 @@
+#coding:utf-8
+from urllib import urlopen
+import pageparser,datamgr
+
+url='http://house.focus.cn/showarticle/1911/572831.html'
+rawContent=datamgr.to_utf8(urlopen(url).read())
+parser=pageparser.CreamParser()   # named 'parser' so it does not shadow the pageparser module
+parser.feed(rawContent)
+#~ print 'url: ',url
+#~ print 'title: ',parser.spot.title
+#~ print 'keywords: ',parser.spot.keywords
+#~ print 'body data: ',parser.bdata
+parser.get_cream()
+#~ print parser.cream
diff --git a/datamgr.py b/datamgr.py
new file mode 100644
index 0000000..3b7e6c9
--- /dev/null
+++ b/datamgr.py
@@ -0,0 +1,69 @@
+#coding:utf-8
+from os import path
+from types import *
+import chardet
+
+class Spot(object):
+    def __init__(self,url,title='',keywords='',timestamp='',literal=''):
+        self.url=url
+        self.title=title
+        self.keywords=keywords
+        self.literal=literal
+        self.timestamp=timestamp
+        self.scream=None
+
+    def set_scream(self,scream):
+        self.scream=scream
+    #~ def __str__(self):
+        #~ return self.url
+    #~ def __eq__(self,item):
+        #~ return self.url==str(item).lower()
+
+class CaselessDict(dict):
+
+    def __init__(self, mapping=None):
+        if mapping:
+            if type(mapping) is dict:
+                for k,v in mapping.items():   # was d.items(); 'd' is undefined in this branch
+                    self.__setitem__(k, v)
+            elif type(mapping) in (list, tuple):
+                d = dict(mapping)
+                for k,v in d.items():
+                    self.__setitem__(k, v)
+
+        # super(CaselessDict, self).__init__(d)
+
+    def __setitem__(self, name, value):
+        if type(name) in StringTypes:
+            super(CaselessDict, self).__setitem__(name.lower(), value)
+        else:
+            super(CaselessDict, self).__setitem__(name, value)
+
+    def __getitem__(self, name):
+        if type(name) in StringTypes:
+            return super(CaselessDict, self).__getitem__(name.lower())
+        else:
+            return super(CaselessDict, self).__getitem__(name)
+
+    def __copy__(self):
+        pass
+
+def to_utf8(data,sencoding=None):
+    if sencoding:
+        try:
+            return data.decode(sencoding).encode('utf-8')
+        except Exception,e:
+            pass
+
+    try:
+        return data.decode('GB18030').encode('utf-8')   # 'GBK18030' is not a codec name
+    except Exception,e:
+        try:
+            return data.decode('GBK').encode('utf-8')
+        except Exception,e:
+            try:
+                sencoding=chardet.detect(data)['encoding']
+                return data.decode(sencoding).encode('utf-8')
+            except Exception,e:
+                return data
diff --git a/fann.py b/fann.py
new file mode 100644
index 0000000..f1cc7f5
--- /dev/null
+++ b/fann.py
@@ -0,0 +1,32 @@
+from pyfann.libfann import neural_net,SIGMOID_SYMMETRIC_STEPWISE
+
+connectionRate = 1
+learningRate = 0.7
+neuronsHiddenNum = 4
+
+desiredError = 0.00005
+maxIterations = 100000
+iterationsBetweenReports = 1000
+inNum=4
+outNum=1
+
+class NeuNet(neural_net):
+    def __init__(self):
+        neural_net.__init__(self)
+        #~ neural_net.create_sparse_array(self,connectionRate,(inNum,neuronsHiddenNum,outNum))
+        neural_net.create_standard_array(self,(inNum,outNum))
+        neural_net.set_learning_rate(self,learningRate)
+        neural_net.set_activation_function_output(self,SIGMOID_SYMMETRIC_STEPWISE)
+
+    def train_on_file(self,fileName):
+        neural_net.train_on_file(self,fileName,maxIterations,iterationsBetweenReports,desiredError)
+
+#~ ann = libfann.neural_net()
+#~ ann.create_sparse_array(connection_rate, (num_input, num_neurons_hidden, num_output))
+#~ ann.set_learning_rate(learning_rate)
+#~ ann.set_activation_function_output(libfann.SIGMOID_SYMMETRIC_STEPWISE)
+#~ ann.train_on_file("../../examples/xor.data", max_iterations, iterations_between_reports, desired_error)
+#~ ann.save("xor_float.net")
diff --git a/grubbs.py b/grubbs.py
new file mode 100644
index 0000000..61ecc27
--- /dev/null
+++ b/grubbs.py
@@ -0,0 +1,81 @@
+import math
+# Grubbs critical values indexed by sample size; column 0 is for
+# significance level a=0.05, column 1 for a=0.01.
+GrubbsRatio={3:[1.15,1.16],
+    4:[1.46,1.49],
+    5:[1.67,1.75],
+    6:[1.82,1.94],
+    7:[1.94,2.10],
+    8:[2.03,2.22],
+    9:[2.11,2.32],
+    10:[2.18,2.41],
+    11:[2.23,2.48],
+    12:[2.28,2.55],
+    13:[2.33,2.61],
+    14:[2.37,2.66],
+    15:[2.41,2.70],
+    16:[2.44,2.75],
+    17:[2.48,2.78],
+    18:[2.50,2.82],
+    19:[2.53,2.85],
+    20:[2.56,2.88],
+    21:[2.58,2.91],
+    22:[2.60,2.94],
+    23:[2.62,2.96],
+    24:[2.64,2.99],
+    25:[2.66,3.01],
+    #the following data were not guaranteed to be true:
+    26:[2.68,3.03],
+    27:[2.70,3.05],
+    28:[2.72,3.07],
+    29:[2.73,3.09],
+    30:[2.74,3.10],
+    }
+
+def grubb_eleminate_outliers(rawList,a=0.05):
+    if a==0.05:
+        idx=0
+    else:
+        idx=1
+    count=len(rawList)
+    if count<=2 or count>30:
+        return rawList
+    ave=average(rawList)
+    variance=get_variance(rawList,ave)
+    newList=[]
+    for i in rawList:
+        # keep the value when its deviation from the mean, in units of the
+        # standard deviation, stays below the critical Grubbs ratio
+        if math.fabs((ave-i)/float(variance))<GrubbsRatio[count][idx]:
+            newList.append(i)
+    return newList
+
+def get_variance(inList,ave):
+    # despite the name, this returns the sample standard deviation
+    sum=0
+    for i in inList:
+        sum+=(i-ave)**2
+    num=len(inList)
+    if num>1:
+        return math.sqrt(sum/float(num-1))
+    return None
+
+def average(inList):
+    sum=0
+    for i in inList:
+        sum+=i
+    num=len(inList)
+    if num>0:
+        return sum/float(num)
+    return None
diff --git a/pageparser.py b/pageparser.py
new file mode 100644
index 0000000..73e0660
--- /dev/null
+++ b/pageparser.py
@@ -0,0 +1,228 @@
+# -*- coding: utf-8 -*-
+from sgmllib import SGMLParser
+import fann,grubbs
+
+class ParseTag(object):
+    """ Class representing a tag which is parsed by the HTML parser(s) """
+
+    def __init__(self, tag, elmlist, enabled=True, init=False):
+        self.tag = tag
+        self.elmlist = elmlist
+        self.enabled = enabled
+        self.init = init
+
+    def disable(self):
+        """ Disable parsing of this tag """
+        self.enabled = False
+
+    def enable(self):
+        """ Enable parsing of this tag """
+        self.enabled = True
+
+    def isEnabled(self):
+        """ Is this tag enabled ? """
+        return self.enabled
+
+    def __eq__(self,tag):
+        return self.tag.lower()==tag.lower()
+
+class RecordTag(object):
+
+    def __init__(self,tag,attrs,inme=True):
+        self.tag=tag
+        self.attrs=attrs
+        self.inMe=inme
+        self.data=''
+        self.parent=None
+        self.preSibling=None
+        self.nextSibling=None
+        self.density=0
+        self.children=[]
+
+    def calculate_density(self):
+        try:
+            total=0.0
+            for key,value in self.attrs:
+                total+=len(key)+len(value)+1
+            total+=len(self.tag)*2+5    # 5=len('<>')+len('</>')
+            dataLen=len(self.data)
+            self.density=dataLen/float(dataLen+total)
+        except Exception,e:
+            print e
+
+    def set_in_me(self,inme):
+        self.inMe=inme
+
+    def still_in_me(self):
+        return self.inMe
+
+    def add_data(self,data):
+        self.data+=data
+
+    def __str__(self):
+        return self.tag
+
+class DOMTree(list):
+    def __init__(self):
+        "lastRecTag : the last closed tag"
+        self.lastClosedRecTag=None
+        self.lastOpenRecTag=None
+        self.curTag=None
+        self.omitTags=['font','br','strong','b']
+
+    def get_siblings(self,recTag):
+        if recTag:
+            return [tag for tag in self.get_children(recTag.parent) if tag!=recTag]
+        return []
+
+    def get_children(self,recTag):
+        if recTag:
+            return recTag.children
+        return []
+
+    def get_last_open_tag(self):
+        try:
+            idx=-1
+            while not self[idx].still_in_me():
+                idx-=1
+            self.lastOpenRecTag=self[idx]
+        except IndexError:
+            pass
+
+    def start_tag(self,tag,attrs):
+        if tag in self.omitTags:
+            return
+        self.get_last_open_tag()
+        self.curTag=RecordTag(tag,attrs)
+        try:
+            preTag=self[-1]
+            self.curTag.parent=self.lastOpenRecTag
+            self.lastOpenRecTag.children.append(self.curTag)
+            if not preTag.still_in_me():
+                self.curTag.preSibling=self.lastClosedRecTag
+                self.lastClosedRecTag.nextSibling=self.curTag
+        except (AttributeError,IndexError):
+            pass
+        self.append(self.curTag)
+
+    def get_last_closed_tag(self):
+        try:
+            idx=-1
+            while not self[idx].still_in_me():
+                idx-=1
+            self.lastClosedRecTag=self[idx]
+        except IndexError:
+            pass
+
+    def end_tag(self,tag):
+        if tag in self.omitTags:
+            return
+        self.get_last_closed_tag()
+        self.lastClosedRecTag.set_in_me(False)
+        self.lastClosedRecTag.calculate_density()
+
+    def handle_data(self,data):
+        self.get_last_open_tag()
+        data=data.strip()
+        try:
+            #~ print 'handle data: ',data,' curTag:',self.curTag,' lastOpenTag: ',self.lastOpenRecTag
+            if self.curTag.still_in_me():
+                self.curTag.add_data(data)
+            else:
+                self.lastOpenRecTag.add_data(data)
+        except AttributeError:
+            pass
+
+class SimpleParser(SGMLParser):
+    features = [ ParseTag('a', ['href']),
+                 ParseTag('link', ['href']),
+                 ParseTag('body', []),
+                 ParseTag('title',[]),
+                 ParseTag('script',[]),
+                 ParseTag('style',[]),
+                 ParseTag('meta', ['CONTENT', 'content',]),
+               ]
+    def __init__(self):
+        SGMLParser.__init__(self)   # initialise sgmllib state before our own fields
+        self.cream=''
+        self.domTree=DOMTree()
+        self.ann=fann.NeuNet()
+        self.ann.create_from_file("cream.net")
+
+    def unknown_starttag(self, tag, attrs):
+        if tag in self.features:
+            parsetag = self.features[self.features.index(tag)]
+            parsetag.init=True
+        self.domTree.start_tag(tag,attrs)
+
+    def unknown_endtag(self, tag):
+        if tag in self.features:
+            parsetag=self.features[self.features.index(tag)]
+            parsetag.init=False
+        self.domTree.end_tag(tag)
+
+    def handle_data(self, data):
+        if not self.features[self.features.index('style')].init \
+           and not self.features[self.features.index('script')].init:
+            self.domTree.handle_data(data)
+
+    def get_cream(self):
+        idx=0
+        bodyIdx=0
+        for rtag in self.domTree:
+            if rtag.tag=='body':
+                bodyIdx=idx
+                break
+            idx+=1
+        candidates={}
+        pos=0
+        for rtag in self.domTree[bodyIdx+1:]:
+            pos+=1
+            if rtag.tag in ['textarea']:
+                continue
+            ownDensity=rtag.density
+            if rtag.preSibling:
+                preDensity=rtag.preSibling.density
+                if preDensity==0.0 and rtag.preSibling.preSibling:
+                    preDensity=rtag.preSibling.preSibling.density
+            else:
+                preDensity=0.0
+            if rtag.nextSibling:
+                nextDensity=rtag.nextSibling.density
+                if nextDensity==0.0 and rtag.nextSibling.nextSibling:
+                    nextDensity=rtag.nextSibling.nextSibling.density
+            else:
+                nextDensity=0.0
+            # score the tag on (text length, own density, neighbouring densities)
+            calc_out=self.ann.run([len(rtag.data),ownDensity,preDensity,nextDensity])
+            if calc_out[0]>-0.7:
+                candidates[pos]=rtag
+            #~ print rtag.tag,' ',calc_out[0],' ',pos,' len:',len(rtag.data),' ',ownDensity,' ',preDensity,' ',nextDensity
+        #~ print "==============================="
+        #eliminate the tags whose positions are far away from most of the tags
+        validTagKeys=grubbs.grubb_eleminate_outliers(candidates.keys())
+        validTagKeys.sort()
+        for key in validTagKeys:
+            print candidates[key].tag,' ',key,' ',candidates[key].data
+
+    def reset(self):
+        SGMLParser.reset(self)
+
+class CreamParser(SimpleParser):
+    """ A parser based on effbot's sgmlop """
+
+    def __init__(self):
+        # This module should be built already!
+        import sgmlop
+        self.parser = sgmlop.SGMLParser()
+        self.parser.register(self)
+        SimpleParser.__init__(self)
+
+    def finish_starttag(self, tag, attrs):
+        self.unknown_starttag(tag, attrs)
+
+    def finish_endtag(self, tag):
+        self.unknown_endtag(tag)
+
+    def feed(self, data):
+        self.parser.feed(data)
diff --git a/pytidy/__init__.py b/pytidy/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/pytidy/pytidy.py b/pytidy/pytidy.py
new file mode 100644
index 0000000..3fa417b
--- /dev/null
+++ b/pytidy/pytidy.py
@@ -0,0 +1,53 @@
+# This file was automatically generated by SWIG (http://www.swig.org).
+# Version 1.3.36
+#
+# Don't modify this file, modify the SWIG interface instead.
+# This file is compatible with both classic and new-style classes.
+
+import _pytidy
+import new
+new_instancemethod = new.instancemethod
+try:
+    _swig_property = property
+except NameError:
+    pass # Python < 2.2 doesn't have 'property'.
+def _swig_setattr_nondynamic(self,class_type,name,value,static=1):
+    if (name == "thisown"): return self.this.own(value)
+    if (name == "this"):
+        if type(value).__name__ == 'PySwigObject':
+            self.__dict__[name] = value
+            return
+    method = class_type.__swig_setmethods__.get(name,None)
+    if method: return method(self,value)
+    if (not static) or hasattr(self,name):
+        self.__dict__[name] = value
+    else:
+        raise AttributeError("You cannot add attributes to %s" % self)
+
+def _swig_setattr(self,class_type,name,value):
+    return _swig_setattr_nondynamic(self,class_type,name,value,0)
+
+def _swig_getattr(self,class_type,name):
+    if (name == "thisown"): return self.this.own()
+    method = class_type.__swig_getmethods__.get(name,None)
+    if method: return method(self)
+    raise AttributeError,name
+
+def _swig_repr(self):
+    try: strthis = "proxy of " + self.this.__repr__()
+    except: strthis = ""
+    return "<%s.%s; %s >" % (self.__class__.__module__, self.__class__.__name__, strthis,)
+
+import types
+try:
+    _object = types.ObjectType
+    _newclass = 1
+except AttributeError:
+    class _object : pass
+    _newclass = 0
+del types
+
+fix = _pytidy.fix
diff --git a/test_DOMTree.py b/test_DOMTree.py
new file mode 100644
index 0000000..3fef7bb
--- /dev/null
+++ b/test_DOMTree.py
@@ -0,0 +1,36 @@
+#coding:utf-8
+import pageparser,datamgr
+from pytidy import pytidy
+
+url='http://blog.qq.com/qzone/41533848/1260352786.htm'
+spotObj=datamgr.Spot(url)
+
+rawContent="""
+  保障性住房缘何多是非
+
+  保障性住房是指政府为中低收入住房困难家庭所提供的限定标准、限定价格或租金的住房,由廉租住房、经济适用住房和政策性租赁住房构成。自从保障性住房推出之后,一直是非不断,以丑闻居多。
+
+  最近,武汉经适房“六连号”、郑州“经适房建别墅事件”等案例暴露出了保障性住房制度上的漏洞,更重要的是将政府执行部门的公信度降低到了极点。许多经济适用房被不符合条件的人占有,成为一些人合法吞噬低收入者福利的一种途径。经济适用房作为一种公共福利,是政府兴建、政府分配,政府成为直接主体,经济适用房的分配不公,使许多真正的中低收入者对于购置保障性住房失去了希望,社会影响极坏。
+
+  造成这样丑闻的原因主要是由于保障性住房的资源过于紧张,中低收入者庞大的需求量与紧张的房源之间不成比例,加之保障性住房的价格与市场上普通的商品房之间价格也有着较大的差异,这就使一些有着“投机思想”和“特权主义”的人费尽心思去徇私舞弊。
+
+  保障性住房成“鸡肋”房
+"""
+parser=pageparser.CreamParser()   # CreamParser is the class pageparser.py defines; it takes no arguments
+parser.feed(rawContent)
+parser.get_cream()
+
+#~ print pytidy.fix("")
") +for rectag in pageparser.domTree: + print rectag.tag,': ' + print ' parent: ',rectag.parent + print ' preSibling: ',rectag.preSibling + print ' nextSibling: ',rectag.nextSibling + print ' children: ',[child.tag for child in pageparser.domTree.get_children(rectag)] + print ' siblings: ',[sibling.tag for sibling in pageparser.domTree.get_siblings(rectag)] + print ' data: ',rectag.data,len(rectag.data) + print ' density: ',rectag.density + diff --git a/test_fann.py b/test_fann.py new file mode 100644 index 0000000..f635126 --- /dev/null +++ b/test_fann.py @@ -0,0 +1,26 @@ +import fann +from pyfann import libfann + +ann=fann.NeuNet() +ann.train_on_file("cream.data") +ann.save("cream.net") + +#~ ann=fann.NeuNet() +#~ ann.create_from_file("cream.net") + +def test(l,res): + print "%s should be %s"%(ann.run(l),res) + +test([350,0.83,0.8],True) +test([114,0.94,0.98,0.96],True) +test([38,0.7,0,0.0],False) +test([32, 0.42, 0, 0],-1) +test_data = libfann.training_data() +test_data.read_train_from_file("cream.data") +ann.test_data(test_data) +print "MSE error on test data: %f" % ann.get_MSE() + +#~ calc_out=ann.run([350,0.83,0.8]) +#~ print calc_out,' should be: ','True' +#~ print ann.run([114,0.94,0.98,0.96]),[114,0.94,0.98,0.96] +#~ print ann.run([38,0.7,0,0.0]),'should be: False'