Showing posts with label others. Show all posts

Thursday, 1 July 2010

Some words on document image binarization (1)

For document images, the most important task is getting at the text they contain. An important preprocessing step, called binarization, therefore segments the text pixels from the image background. It can also be viewed as classifying every pixel of the document into one of two categories: background or foreground.

A pixel's label is usually determined by thresholding. Thresholding techniques are either global (one threshold for the whole image) or local (different thresholds for different pixels). Local thresholding is usually the better choice, because for many degraded document images no single threshold separates the text from the background.
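To make the global/local distinction concrete, here is a minimal sketch in plain Python. The tiny test image, the function names `global_threshold` and `local_mean_threshold`, and the `radius` and `offset` parameters are all made up for illustration. The local rule used here (a pixel is text if it is darker than its neighbourhood mean by some margin) is just one simple choice, not a particular published method.

```python
def global_threshold(image, t):
    """Label a pixel as text (1) if it is darker than the single threshold t."""
    return [[1 if p < t else 0 for p in row] for row in image]


def local_mean_threshold(image, radius=1, offset=30):
    """Label a pixel as text (1) if it is darker than the mean of its
    (2*radius+1)^2 neighbourhood minus offset. Both parameters are
    illustrative assumptions, not tuned values."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            out[y][x] = 1 if image[y][x] < mean - offset else 0
    return out


# Toy degraded page: the left half is brightly lit, the right half is in shadow.
# One text pixel sits in each half (120 on the left, 40 on the right).
image = [
    [200, 200, 200, 100, 100, 100],
    [200, 120, 200, 100,  40, 100],
    [200, 200, 200, 100, 100, 100],
]
expected = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

local_result = local_mean_threshold(image)
# No single global threshold recovers both text pixels without also
# marking the shadowed background as text.
global_ok = any(global_threshold(image, t) == expected for t in range(256))
```

On this toy image the left text pixel (120) is brighter than the shadowed background (100), so every global threshold either misses it or swallows the shadow, while the local rule compares each pixel only with its own neighbourhood.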

Pixel intensity and image gradient/contrast are usually good features for threshold selection. Sometimes domain knowledge (such as text stroke information) is applied to produce a better result. However, it is still quite hard to find domain knowledge that can actually be incorporated. Think about how we humans recognize text strokes. The intensity (usually dark on a white background) and the contrast (which lets us separate the text from the background) are easy to identify. But is that enough? How do we tell that something is a character even when its stroke is broken? How do we separate the text stroke from both the background and the noise? And when characters run together, we can still recognize them. The point is that we can do all this because we know what text is and what characters are; even if we don't know French or Japanese, we can still point out the characters instead of taking them for noise.

What I believe is that there should be some common sense about what a character is, which applies to all the languages in the world. But I don't know what it is, or how to program it.

So from my point of view, to really propose a method that can replace humans, we must view it as a learning problem. I don't know whether that is right or wrong~~~

Thursday, 1 April 2010

Workstations to drool over

Original link: 25 Elegant Workstations for Your Inspiration

I picked out a few of the best pictures from it. If only I could have one of these someday.


Monday, 25 May 2009

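Some quotes about computers

Original post: http://kisshi.com/2009/05/25/ji-suan-ji/

English source: http://www.devtopics.com/101-great-computer-programming-quotes/

1. "Computers are useless. They can only give you answers."
(Pablo Picasso, painter)

2. "Computers are like bikinis. They save people a lot of guesswork."
(Sam Ewing, writer)

5. "If the automobile had followed the same development cycle as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year, killing everyone inside."
(Robert X. Cringely, technology writer)

Computer intelligence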

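6. "Computers are getting smarter all the time. Scientists tell us that soon they will be able to talk to us. (And by 'they', I mean 'computers'. I doubt scientists will ever be able to talk to us.)"
(Dave Barry, humorist)

8. "The question of whether computers can think is like the question of whether submarines can swim."
(Edsger W. Dijkstra, Turing Award winner)

9. "It's ridiculous to live 100 years and only be able to remember 30 million bytes. You know, less than a compact disc. The human condition is really becoming more obsolete every minute."
(Marvin Minsky, a founder of artificial intelligence research)

11. "Never trust a computer you can't throw out a window*."
(Steve Wozniak, Apple co-founder)
*Translator's note: an allusion to Microsoft's Windows operating system

14. "I finally understand what 'upward compatibility' means. It means we get to keep all our old mistakes."
(Dennie van Tassel)

17. "Every operating system out there is about equal… We all suck."
(Brian Valentine, senior vice president at Microsoft, describing the state of operating system security, 2003)

18. "Microsoft has a new version out, Windows XP, which according to everybody is the 'most reliable Windows ever.' To me, this is like saying that asparagus is 'the most articulate vegetable ever.'"
(Dave Barry)

31. "There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies."
(C.A.R. Hoare)

34. "Software suppliers are trying to make their software more 'user-friendly'… Their best approach so far has been to take all the old brochures and stamp the words 'user-friendly' on the cover."
(Bill Gates)

35. "There's an old story about the person who wished his computer were as easy to use as his telephone. That wish has come true, since I no longer know how to use my telephone."
(Bjarne Stroustrup, creator of C++)

37. "There are only two industries that refer to their customers as 'users'*."
(Edward Tufte, information design expert)
*Translator's note: the two are computer design and drug dealing

Programmers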


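38. "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to produce bigger and better idiots. So far, the universe is winning."
(Rich Cook)

39. "Most of you are familiar with the virtues of a programmer. There are three, of course: laziness, impatience, and hubris."
(Larry Wall, creator of Perl)

40. "The trouble with programmers is that you can never tell what a programmer is doing until it's too late."
(Seymour Cray, father of supercomputing)

41. "That's the thing about people who think they hate computers. What they really hate is lousy programmers."
(Larry Niven, science fiction writer)

42. "For a long time it puzzled me how something so expensive, so leading edge, could be so useless. And then it occurred to me that a computer is a stupid machine with the ability to do incredibly smart things, while computer programmers are smart people with the ability to do incredibly stupid things. They are, in short, a perfect match."
(Bill Bryson, travel writer)

43. "Computer science education cannot make anybody an expert programmer any more than studying brushes and pigment can make somebody an expert painter."
(Eric Raymond, open source movement leader)

Programming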


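48. "Don't worry if it doesn't work right. If everything did, you'd be out of a job."
(Mosher's Law of Software Engineering)

49. "Measuring programming progress by lines of code is like measuring aircraft building progress by weight."
(Bill Gates)

50. "Writing code has a place in the human hierarchy worth somewhere above grave robbing and beneath managing."
(Gerald Weinberg, software and systems thinker)

51. "First learn computer science and all the theory. Next develop a programming style. Then forget all that and just hack."
(George Carrette, distinguished software engineer and open source advocate)

52. "First, solve the problem. Then, write the code."
(John Johnson)

54. "To iterate is human, to recurse divine."
(L. Peter Deutsch)

55. "The best thing about a boolean is even if you are wrong, you are only off by a bit."
(Anonymous)

56. "Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration."
(Stan Kelly-Bootle)

Programming languages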


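57. "There are only two kinds of programming languages: those people always bitch about and those nobody uses."
(Bjarne Stroustrup, creator of C++)

58. "PHP is a minor evil perpetrated and created by incompetent amateurs, whereas Perl is a great and insidious evil perpetrated by skilled but perverted professionals."
(Jon Ribbens)

59. "The use of COBOL cripples the mind; its teaching should therefore be regarded as a criminal offense."
(E.W. Dijkstra)

60. "It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration."
(E. W. Dijkstra)

61. "I think Microsoft named .Net so it wouldn't show up in a Unix directory listing."
(Oktal)

62. "There is no programming language, no matter how structured, that will prevent programmers from making bad programs."
(Larry Flon)

63. "Computer language design is just like a stroll in the park. Jurassic Park, that is."
(Larry Wall)

C/C++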


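64. "Fifty years of programming language research, and we end up with C++?"
(Richard A. O'Keefe)

65. "Writing in C or C++ is like running a chain saw with all the safety guards removed."
(Bob Gray)

70. "Saying that Java is nice because it works on all OSes is like saying that anal sex is nice because it works on all genders."
(Alanna)

71. "Fine, Java MIGHT be a good example of what a programming language should be like. But Java applications are good examples of what applications SHOULDN'T be like."
(pixadel)

72. "If Java had true garbage collection, most programs would delete themselves upon execution."
(Robert Sewell)

Open source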


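73. "Software is like sex: it's better when it's free."
(Linus Torvalds)

74. "The only people who have anything to fear from free software are those whose products are worth even less."
(David Emery)

Code

75. "Good code is its own best documentation."
(Steve McConnell, author of Code Complete)

76. "Any code of your own that you haven't looked at for six or more months might as well have been written by someone else."
(Eagleson's Law)

83. "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
(Brian Kernighan, co-author of The C Programming Language)

84. "If debugging is the process of removing bugs, then programming must be the process of putting them in."
(Edsger W. Dijkstra)

Quality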


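85. "I don't care if it works on your machine! We are not shipping your machine!"
(Vidiu Platon, Microsoft Student Partner (MSP) in Romania)

86. "Programming is like sex: one mistake and you're providing support for a lifetime."
(Michael Sinz)

87. "There are two ways to write error-free programs; only the third one works."
(Alan J. Perlis)

88. "You can either have software quality or you can have pointer arithmetic, but you cannot have both at the same time."
(Bertrand Meyer)

89. "If McDonalds were run like a software company, one out of every hundred Big Macs would give you food poisoning, and the response would be, 'We're sorry, here's a coupon for two more.'"
(Mark Minasi)

90. "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
(Martin Golding)

91. "To err is human, but to really foul things up you need a computer."
(Paul Ehrlich)

92. "A computer lets you make more mistakes faster than any invention in human history, with the possible exceptions of handguns and tequila."
(Mitch Radcliffe)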



Tuesday, 28 April 2009

Concept of Programming Language

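1. A programming language is the combination of a syntax and a semantics.

2. Two important kinds of semantics. Denotational semantics: express the meaning of a piece of the language in another language; usually that "other language" is mathematics. Operational semantics: view the language as state transitions, (state * var * value) -> state.

3. Lambda calculus: (lambda x (f x)) is the function that takes a parameter x and returns f(x).

Free variables and bound variables: in (lambda x (f x y)), x is bound, while f and y are free.

α-equivalence: (lambda x x) = (lambda y y)

β-reduction: (lambda x M) N = [N/x]M, i.e. replace every free occurrence of x in M with N.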
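The substitution [N/x]M can be sketched in a few lines of Python. Everything here is my own illustrative encoding: terms are tuples `("var", name)`, `("lam", name, body)` and `("app", f, arg)`, and the helper names `free_vars`, `subst` and `beta` are made up. The substitution is deliberately naive: it assumes no free variable of N gets captured by a binder inside M, so it is not a full capture-avoiding implementation.

```python
def free_vars(t):
    """Return the set of free variable names of a term."""
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "lam":
        return free_vars(t[2]) - {t[1]}      # the binder removes its own name
    return free_vars(t[1]) | free_vars(t[2])  # application


def subst(t, x, n):
    """[n/x]t: replace the free occurrences of x in t with n.
    Naive version: assumes no free variable of n is captured."""
    kind = t[0]
    if kind == "var":
        return n if t[1] == x else t
    if kind == "lam":
        if t[1] == x:
            return t  # x is rebound here, so there is no free x below
        return ("lam", t[1], subst(t[2], x, n))
    return ("app", subst(t[1], x, n), subst(t[2], x, n))


def beta(t):
    """One beta step at the root: (lambda x M) N -> [N/x]M."""
    if t[0] == "app" and t[1][0] == "lam":
        _, x, body = t[1]
        return subst(body, x, t[2])
    return t


# (lambda x (f x y)), written as nested applications ((f x) y)
term = ("lam", "x", ("app", ("app", ("var", "f"), ("var", "x")), ("var", "y")))
reduced = beta(("app", term, ("var", "z")))
```

Here `free_vars(term)` contains `f` and `y` but not `x`, and applying the term to `z` yields the encoding of (f z y).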



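4. Side effects: if the body of f(x, y) does not change the values of x and y (or any other state visible outside the function), f is said to have no side effects.

5. Types:

Compile-time checking:

Run-time checking:

Type checking: the types are declared explicitly

Type inference: the types are worked out at compile time from the types of the arguments and functions, without explicit declarations

Polymorphism

Overloading

Type safety: a language that allows explicit deallocation of memory cannot be completely type safe.

6. Control:

Dynamic scope: for exception handlers, follow the control link

Static scope: for variable declarations, follow the access link

7. Parameter passing:

Pass by name:

Pass by value:

Pass by reference:

8. Tail recursion: in a tail-recursive function the result either comes straight from the recursive call or needs no recursion at all, so it feels a lot like iteration. The benefit is that the current environment does not have to be saved on each recursive call. f(x) -> f(x, P), where P is the continuation of the current state x.

9. Continuation: I understand it as a function that represents the remaining computation from the current state of a program. So every time the state of the program changes (a variable's value changes, a declaration is made, ...) there is a new state.

The callcc function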
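Items 8 and 9 can be seen side by side on factorial. This is only a sketch with names I made up (`fact`, `fact_tail`, `fact_cps`); note that CPython does not actually eliminate tail calls, so the Python version shows the shape of the idea rather than the stack savings.

```python
def fact(n):
    # Not tail recursive: the multiplication still has to happen
    # after the recursive call returns, so the frame must be kept.
    return 1 if n == 0 else n * fact(n - 1)


def fact_tail(n, acc=1):
    # Tail recursive: the recursive call is the very last step,
    # and the partial result travels along in acc.
    return acc if n == 0 else fact_tail(n - 1, acc * n)


def fact_cps(n, k=lambda r: r):
    # Continuation-passing style, f(x) -> f(x, P): k is the continuation,
    # a function standing for "the rest of the computation".
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda r: k(n * r))


results = (fact(5), fact_tail(5), fact_cps(5))
```

All three compute 5! = 120; the difference is only in where the pending work lives: on the call stack, in an accumulator, or in an explicit continuation.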



10. Scope:

Control link: dynamic form

Access link: static form

11. Total functions & partial functions

12. L-values: memory locations; R-values: contents

13. Anonymous functions: useful when passing functions as parameters

14. Object orientation

Dynamic lookup -> the code that runs depends on the object and the message

        Different from overloading:
        overloading: resolved at compile time
        dynamic lookup: resolved at run time

Encapsulation -> you only need to know the interface

Inheritance -> relationship between implementations

Subtyping -> relationship between interfaces
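Dynamic lookup and encapsulation are easy to show in Python (which has no compile-time overloading to contrast with). The classes `Square` and `Rect` and the function `total_area` are invented for this sketch.

```python
class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Rect:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # Dynamic lookup: which area() runs is decided at run time from the
    # object receiving the message, not from any declared type.
    # Encapsulation: this function only relies on the area() interface.
    return sum(shape.area() for shape in shapes)


total = total_area([Square(2), Rect(2, 3)])
```

`Square` and `Rect` share no implementation (no inheritance), yet both can be used wherever an object with `area()` is expected, which is the interface-level relationship that subtyping is about.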




15. Languages:

OO: Smalltalk, Simula, Self, C++, Java

Functional: ML, Lisp, Haskell

Others: JavaScript, PHP, Python......




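Python code for exporting Xiaonei blog posts

Update: fixed some problems in the source code... added timestamps...

I still have lots of ideas... I just don't have the mood or the time to work on them properly... sigh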


Tuesday, 31 March 2009

Concept of Programming Language

1. A programming language is the combination of a syntax and a semantics.

2. Two important kinds of semantics: denotational semantics

Wednesday, 25 February 2009

Computing Similarity Transformation

 

Eigen Decomposition

The matrix decomposition of a square matrix A into so-called eigenvalues and eigenvectors is an extremely important one. This decomposition generally goes under the name "matrix diagonalization." However, this moniker is less than optimal, since the process being described is really the decomposition of a matrix into a product of three other matrices, only one of which is diagonal, and also because all other standard types of matrix decomposition use the term "decomposition" in their names, e.g., Cholesky decomposition, Hessenberg decomposition, and so on. As a result, the decomposition of a matrix into matrices composed of its eigenvectors and eigenvalues is called eigen decomposition in this work.

 

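With P the square matrix whose columns are the eigenvectors of A and D the diagonal matrix of the corresponding eigenvalues, we have AP = PD, giving the amazing decomposition of A into a similarity transformation involving P and D,

\[ A = P D P^{-1}. \tag{12} \]

The fact that this decomposition is always possible for a square matrix A as long as P is an invertible square matrix (that is, A has a full set of linearly independent eigenvectors) is known in this work as the eigen decomposition theorem.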

Furthermore, squaring both sides of equation (12) gives

\begin{align}
A^2 &= (P D P^{-1})(P D P^{-1}) \tag{13} \\
    &= P D (P^{-1} P) D P^{-1} \tag{14} \\
    &= P D^2 P^{-1}. \tag{15}
\end{align}

By induction, it follows that for general positive integer powers,

\[ A^n = P D^n P^{-1}. \tag{16} \]

The inverse of A is

\begin{align}
A^{-1} &= (P D P^{-1})^{-1} \tag{17} \\
       &= P D^{-1} P^{-1}. \tag{18}
\end{align}
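Equations (12), (16) and (18) can be checked by hand on a small example. The sketch below uses a 2x2 symmetric matrix I picked for convenience (eigenvalues 3 and 1, eigenvectors (1, 1) and (1, -1)); the helper names `matmul`, `diag_power` and `from_eigen` are made up, and exact fractions are used so the comparisons are not blurred by rounding.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def diag_power(D, n):
    """Raise a diagonal matrix to an integer power entry by entry
    (n may be negative as long as no eigenvalue is zero)."""
    size = len(D)
    return [[D[i][j] ** n if i == j else F(0) for j in range(size)]
            for i in range(size)]


# Example chosen for illustration: A has eigenvalues 3 and 1.
A = [[F(2), F(1)], [F(1), F(2)]]
P = [[F(1), F(1)], [F(1), F(-1)]]              # eigenvectors as columns
P_inv = [[F(1, 2), F(1, 2)], [F(1, 2), F(-1, 2)]]
D = [[F(3), F(0)], [F(0), F(1)]]               # eigenvalues on the diagonal


def from_eigen(n):
    """P D^n P^{-1}, which should equal A^n."""
    return matmul(matmul(P, diag_power(D, n)), P_inv)


A_cubed = matmul(matmul(A, A), A)
identity = [[F(1), F(0)], [F(0), F(1)]]
```

With these values `from_eigen(1)` reproduces A, `from_eigen(3)` matches A multiplied out three times, and A times `from_eigen(-1)` gives the identity. The point of (16) is that the power only has to be applied to the diagonal entries.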

Tuesday, 2 September 2008

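Linear and Non Linear

Yesterday in my computer vision & pattern recognition class, while talking about linear and nonlinear, the lecturer told a story.

A man lost his keys one night and was searching for them on the road. A policeman came over and asked what had happened. The man pointed to a dark corner of the road and said, "I dropped my keys over there." Puzzled, the policeman asked, "If you dropped your keys over there, why are you searching here under the streetlight?" The man said, "I can't see anything over there. Here there is a streetlight and I can see, so I'm searching here."

The relationship between linear and nonlinear is more or less like that.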

Sunday, 15 July 2007

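Learning Lex and Yacc

Have a look on CSDN: articles such as 《Lex和Yacc应用方法》 (Using Lex and Yacc) and 《Lex和Yacc从入门到精通》 (Lex and Yacc from Beginner to Master) are all quite good.

My compiler only came together thanks to these articles.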

Monday, 25 June 2007

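[Repost] A female programmer's requirements specification for a boyfriend

http://www.yesky.com/SoftChannel/72342389024358400/20030626/1710393.shtml

  Preface

  People often say that programmers lead dull lives and are stiff characters. That only shows you don't understand programmers. Outside of code, these high-IQ people are humorous and fun, and forums are often where they show off their talents (sadly, because they get too few places and too little time to show any talent other than coding). Here is a wonderful post, with its replies, that I read on a forum and tidied up for everyone's amusement.

  Main text

  Purpose: seeking one boyfriend

  Overview: height of at least 1.76 (because I am 1.70), proficient in C++ programming (at least better than me), 24 or older (because my age > 23 && my age < 24), physically strong (for a sense of security), working in Beijing (because I don't plan to go anywhere else). Main reason for this search: I can't stand my mom's enthusiasm. Secondary reason: I want to find a like-minded person.

  About me: I have worked in the computer industry in Beijing for two years. My level is not high, but I aspire to become an expert, and I firmly believe that only by starting out as a coder can one truly become a master. I admire C++ gurus, and in my spare time I enjoy music and football.

  UseCase1:

  Basic path:

   1: You are a sincere person, not just playing around
   2: Leave me your basic details and contact information
   3: If I think you are suitable, I will contact you
   4: We try to become friends
   5: We become a couple
   6: We get married

  Exception path:

   1: At step 3, I think you are not suitable
   2: I will not contact you. I am very sorry, and I hope you find a better match!

  Replies from other users follow:

  Reply 1:

    Once the project is complete, we strongly demand that the development, test and maintenance documentation be published.

  Reply 2:

    If he is not suitable, just GOTO: I will not contact you. I am very sorry, and I hope you find a better match!

  Reply 3:

    The boyfriend search must follow the CMM5 process, and this project should strive to become a CMM5 model project!

    A CMM review panel is now being formed; sign up if you want to join.....

  Reply 4:

    Your document fails ISO2002-SW-CMM1, so the project cannot be approved. Go ask the technical director!

  Reply 5:

    The CMM panel has unanimously decided that the requirements do not pass: the materials supplied by the requesting party cannot be analysed at all (for example, the requester's own qualifications, photos, etc.), so this review has failed. ...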

Thursday, 17 May 2007

Some compiler basics


lexical analysis


token   attribute     <token,attribute>
pattern
lexemes
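As a reminder of how token, pattern and lexeme fit together, here is a small regex-based scanner sketch. The token names (`NUM`, `ID`, `OP`), their patterns and the function `tokenize` are my own toy choices, not taken from any particular compiler; each pattern describes a token class, each matched piece of input is a lexeme, and the output is a list of <token, attribute> pairs with the lexeme as the attribute.

```python
import re

# (token name, pattern) pairs; order matters when patterns overlap.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),   # whitespace is matched but not reported
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn the input into a list of (token, attribute) pairs."""
    pos = 0
    tokens = []
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            # Simplest possible error handling: stop at the first bad character.
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


pairs = tokenize("x1 = 42 + y")
```

For the input above, `x1` and `y` are lexemes of the token `ID`, `42` is a lexeme of `NUM`, and `=` and `+` are lexemes of `OP`.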


error recovery


input buffering


add a sentinel EOF


prefix
suffix
substring
subsequence  


union
concatenation
Kleene closure   L*
positive closure L+


regular language


regular expressions


regular set (r)|(s) (r)(s) (r)* (r)
regular definition


non regular set


transition diagram


NFA
DFA


NFA->DFA->Min DFA
RE->NFA
RE->DFA
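The NFA->DFA step is the subset construction: every DFA state is a set of NFA states. Below is a compact sketch; the dictionary-based encoding of the automaton and the names `epsilon_closure`, `nfa_to_dfa` and `dfa_accepts` are my own conventions, and the example NFA for (a|b)*abb is written by hand rather than generated from the regular expression.

```python
from collections import deque


def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(start, accepting, delta, eps, alphabet):
    """Subset construction.
    delta: {(state, symbol): set of states}, eps: {state: set of states}."""
    d_start = epsilon_closure({start}, eps)
    dfa, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            moved = set()
            for s in current:
                moved |= delta.get((s, symbol), set())
            target = epsilon_closure(moved, eps)
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    d_accept = {state for state in seen if state & accepting}
    return d_start, d_accept, dfa, seen


def dfa_accepts(d_start, d_accept, dfa, word):
    state = d_start
    for ch in word:
        state = dfa[(state, ch)]
    return state in d_accept


# Hand-written NFA for (a|b)*abb: state 0 loops on a and b and guesses
# where the final "abb" starts; state 3 is accepting.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
d_start, d_accept, dfa, d_states = nfa_to_dfa(0, {3}, delta, {}, "ab")
```

For this NFA the construction reaches four DFA states, {0}, {0,1}, {0,2} and {0,3}, and only {0,3} is accepting. Minimization (the Min DFA step) is not shown.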


the phases of a compiler


lexical
syntax
semantic
intermediate code generator
code optimizer
code generator
symbol table
error handler



Sunday, 8 April 2007

Maximum Entropy Method















A deconvolution algorithm (sometimes abbreviated MEM) which functions by minimizing a smoothness function ("entropy") in an image. Maximum entropy is also called the all-poles model or autoregressive model. For images with more than a million pixels, maximum entropy is faster than the CLEAN algorithm.


MEM is commonly employed in astronomical synthesis imaging. In this application, the resolution depends on the signal-to-noise ratio, which must be specified. Therefore, resolution is image dependent and varies across the map. MEM is also biased, since the ensemble average of the estimated noise is nonzero. However, this bias is much smaller than the noise for pixels with a SNR>>1. It can yield super-resolution, which can usually be trusted to an order of magnitude in solid angle.


Two definitions of "entropy" normalized to the flux in the image are






\begin{align}
H_1 &= \sum_k \ln\left(\frac{I_k}{M_k}\right) \tag{1} \\
H_2 &= -\sum_k I_k \ln\left(\frac{I_k}{M_k e}\right), \tag{2}
\end{align}





where M_k is a "default image" and I_k is the smoothed image. Several unnormalized entropy measures (Cornwell 1982, p. 3) are given by









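\begin{align}
H_3 &= -\sum_i f_i \ln(f_i) \tag{3} \\
H_4 &= \sum_i \ln(f_i) \tag{4} \\
H_5 &= -\sum_i \frac{1}{\ln(f_i)} \tag{5} \\
H_6 &= -\sum_i \frac{1}{[\ln(f_i)]^2} \tag{6} \\
H_7 &= \sum_i \sqrt{\ln(f_i)}. \tag{7}
\end{align}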
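Two of these measures are easy to evaluate numerically, which gives a feel for what "entropy" rewards. The sketch below implements H_2 and H_3 exactly as written above; the function names `h2` and `h3` and all the sample values are made up for illustration, and the inputs are assumed to be strictly positive so the logarithms exist.

```python
import math


def h2(image, default):
    """H_2 = -sum_k I_k ln(I_k / (M_k e)) for positive pixel values."""
    return -sum(i * math.log(i / (m * math.e))
                for i, m in zip(image, default))


def h3(f):
    """H_3 = -sum_i f_i ln(f_i) for positive values f_i."""
    return -sum(x * math.log(x) for x in f)


# H_3 on two normalized "images" with the same total flux of 1.
flat = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]
h3_flat, h3_peaked = h3(flat), h3(peaked)

# H_2 relative to a default image M.
default = [1.0, 2.0, 3.0]
other = [2.0, 2.0, 2.0]
h2_at_default, h2_other = h2(default, default), h2(other, default)
```

The flat image scores higher on H_3 than the peaked one (ln 4 for four equal pixels), and H_2 is largest when the image equals the default image, where it reduces to the total flux of M. Setting the derivative of each term of H_2 with respect to I_k to zero gives ln(I_k/M_k) = 0, which is why the default image is the maximizer when no data constraint is applied.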