Sebastian Bernhardsson et al 2009 New J. Phys. 11 123015 doi:10.1088/1367-2630/11/12/123015
Sebastian Bernhardsson, Luis Enrique Correa da Rocha and Petter Minnhagen
Evidence is presented for a systematic text-length dependence of the power-law index γ of a single book. The estimated γ values are consistent with a monotonic decrease from 2 to 1 with increasing text length. A direct connection to an extended Heaps' law is explored. The infinite-book limit is, as a consequence, proposed to be given by γ=1 instead of the value γ=2 expected if Zipf's law is universally applicable. In addition, we explore the idea that the systematic text-length dependence can be described by a meta-book concept, which is an abstract representation reflecting the word-frequency structure of a text. According to this concept, the word-frequency distribution of a text of a certain length written by a single author has the same characteristics as a text of the same length extracted from an imaginary complete infinite corpus written by the same author.
Issue 12 (December 2009)
Received 10 September 2009
Published 10 December 2009
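The two quantities the abstract relates, the word-frequency distribution and the growth of the number of distinct words with text length (the quantity described by Heaps' law), can be measured directly from a tokenized text. The following is a minimal sketch and not the authors' procedure: the function names `vocabulary_growth` and `frequency_distribution`, the whitespace tokenization, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def vocabulary_growth(words):
    """Number of distinct words among the first M words, for M = 1..len(words)."""
    seen = set()
    growth = []
    for word in words:
        seen.add(word)
        growth.append(len(seen))
    return growth


def frequency_distribution(words):
    """Map k -> number of distinct words that occur exactly k times."""
    occurrences = Counter(words)              # word -> number of occurrences
    return dict(Counter(occurrences.values()))  # k -> how many words occur k times


# Toy input (an assumption for illustration; a real analysis would use a whole book).
text = "the cat sat on the mat and the dog sat on the log"
words = text.split()  # naive whitespace tokenization

growth = vocabulary_growth(words)
dist = frequency_distribution(words)

print(growth[-1])            # distinct words in the full text
print(sorted(dist.items()))  # (k, number of words occurring k times)
```

Estimating a power-law index γ from `dist`, and repeating the measurement on sections of increasing length, would be the next step toward the text-length dependence described in the abstract. That fitting step is deliberately left out here because the abstract does not specify how it was done.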