Alessio Caiazza is sharing code with you

nolith / ipv6 - fine del mondo http://slideshare.net/nolith/ipv6-e-la-fine-del-mondo

Slides on IPv6 for the Notte Blù

Clone this repository (size: 9.7 MB): HTTPS / SSH
hg clone https://bitbucket.org/nolith/ipv6-fine-del-mondo
hg clone ssh://hg@bitbucket.org/nolith/ipv6-fine-del-mondo

ipv6 - fine del mondo / latex_task.rb

=begin rdoc
==Introduction
This file contains a rake task which can be used to process latex files to produce
dvi, pdf or other kinds of files.

To create a latex task, simply call <tt>latex_file</tt> with the name of the output
file, and, if the default options satisfy your needs, you're done. If you need to
customize some parameter, you can pass a block to <tt>latex_file</tt> and the block
will be called with the _runner_ corresponding to the task as argument. A runner
is an object of class +LaTeXRunner+, which encapsulates all the information 
needed to process the tex file. In the block, you can change its attributes, 
tailoring it to your needs.

==Dependencies
An attempt has been made to automatically compute dependencies for the latex 
task. The result is not perfect, but should work in many cases. It works this
way:
* when <tt>latex_file</tt> is called, after the block has been called, the main
  input file (the one which is passed to latex) is read, and the arguments of the 
  <tt>\includeonly</tt>, <tt>\include</tt> and <tt>\input</tt> latex commands are
  extracted. Of these files, those which already exist are added as dependencies
  of the task. The others are skipped.
* when the latex task is executed, the main file is again scanned for the same
  commands. A file task is created and executed for each of the files which
  hadn't been considered previously. A task which can't be built is simply skipped

The reason for this two-step approach is that the main tex file may not be up
to date when <tt>latex_file</tt> is defined (for example, this is the case if
the file is generated by some other task), so some new dependencies may not be
present. At the same time, some files which were included in the old version may
not exist in the new one. This could cause some unsatisfied dependencies if those
files don't exist, and this is the reason for which only existing files are added
as dependencies in the first step.

The second step is needed because, in the case of automatically generated tex files,
some of the included files may not exist and need to be generated before running
latex but, as explained above, this may not be known when the task is created.

Of course, the user can add dependencies by hand. <tt>latex_file</tt> itself
doesn't accept dependencies as arguments, but they can be given later to the task:

 latex_file('my_file.dvi')
 file 'my_file.dvi' => ['x.tex', 'y.tex']

==Multiple runs
One of the problems with writing a Makefile for LaTeX is that often latex needs
to be run more than once on the same file, before obtaining the final output.
Moreover, every LaTeX package may require other runs based on different conditions
and may even require some other program to be run between latex invocations.

The approach followed here is to recognize the impossibility of a single algorithm
which works in every situation and to create an extensible one, instead, giving
the user the ability to modify it according to their needs.

This algorithm is based on _actions_, that is Procs which are called before and 
after each latex invocation. Those called before latex is run 
(<i>pre-run actions</i>) are usually used to read the content of some file and
store it in the runner, to be used later. The actions called after latex is run
(<i>post-run actions</i>) usually read the same files again and compare the value
with that stored by the corresponding <i>pre-run action</i> in the runner (which
is used exactly as a Hash). According to the result of the comparison, they
can execute some command (for example, if an action detects that the index has
changed, it'll run the _makeindex_ command) and tell the runner whether other
invocations of latex are needed. This is decided by the return value of the action:
a value of 0 or *nil* means that the action doesn't need additional runs; a value
of 1 or more means that the action requires at least that number of other runs.

This process is executed until all <i>post-run actions</i> return 0 and the number
of runs done is equal to the highest requested (for example: after the first run,
action _A_ returns 2, meaning that it requires two additional runs. After the second
one all actions return 0, but the cycle is repeated another time all the same,
otherwise the previous request of action _A_ wouldn't be satisfied).

To avoid problems of endless loops, there's a maximum number of runs, which is
5 by default. The user can change it in the runner.
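
The counting rule can be illustrated with a small simulation which doesn't
invoke latex at all. It's only a sketch: <tt>simulate_runs</tt> and the stand-in
actions are hypothetical names, and each action here receives the run number
instead of a runner and simply returns a scripted value.

```ruby
# Hypothetical simulation of the run-counting rule: no latex is invoked.
# Each post-run action is a lambda returning the number of extra runs it
# wants (nil counts as 0).
def simulate_runs(post_actions, max_runs = 5)
  remaining = 1
  runs = 0
  max_runs.times do
    remaining -= 1
    runs += 1                               # latex would be run here
    requested = post_actions.map { |a| a.call(runs) || 0 }.max || 0
    remaining += requested                  # only the highest request counts
    break if remaining < 1
  end
  runs
end

# Action A asks for two more runs after the first one, then is satisfied.
action_a = lambda { |run| run == 1 ? 2 : 0 }
# Action B never needs another run and returns nil.
action_b = lambda { |run| nil }

puts simulate_runs([action_a, action_b])    # prints 3
# An action which always asks for one more run is stopped by the limit.
puts simulate_runs([lambda { |run| 1 }])    # prints 5
```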

This file contains two pairs of <i>pre-run</i>/<i>post-run</i> actions. One
checks whether the references have changed, the other takes care of the
changes to the index file, calling <tt>makeindex</tt> when necessary.

<b>Note:</b> when <i>pre-run</i> and <i>post-run</i> actions are called, the
current directory is the directory where the output file will be put. This is
because most files the actions will be interested in will be there. Actions which
need to access files in the input dir (where the main tex file is) may do so
using the <tt>full_input_dir</tt> attribute of the runner object.
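
As an illustration of the note above, here is a sketch of a pre-run action which
reads a file kept next to the main tex file. Everything in it is an assumption
made for the example: <tt>FakeRunner</tt> is a minimal stand-in for the real
runner, and <tt>BIB_ACTION</tt> and the <tt>refs.bib</tt> file name are
hypothetical.

```ruby
require 'tmpdir'

# Minimal stand-in for the runner, for illustration only: a Hash with the
# one attribute this action needs.
class FakeRunner < Hash
  attr_accessor :full_input_dir
end

# Hypothetical pre-run action: stores the contents of refs.bib, which lives
# in the input directory rather than in the output directory.
BIB_ACTION = lambda do |runner|
  bib = File.join(runner.full_input_dir, 'refs.bib')
  runner[:old_bib] = File.exist?(bib) ? File.read(bib) : nil
end

Dir.mktmpdir do |input_dir|
  Dir.mktmpdir do |output_dir|
    File.write(File.join(input_dir, 'refs.bib'), "@misc{example}\n")
    runner = FakeRunner.new
    runner.full_input_dir = input_dir
    # Actions are called with the output directory as current directory.
    Dir.chdir(output_dir) { BIB_ACTION.call(runner) }
    puts runner[:old_bib]                   # prints @misc{example}
  end
end
```

Building the path from <tt>full_input_dir</tt> keeps the action independent of
the current directory.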

==Examples
===Simple LaTeX tasks
* <tt>latex_file('my_file.dvi') </tt>

  This will create a latex task to build the file 'my_file.dvi' from the file
  'my_file.tex' using latex. The default actions are used.

* <tt>latex_file('my_file.pdf')</tt>

  The same as above, but the output will be a pdf file created with pdflatex

* <tt>
    latex_file('output/my_file.dvi') do |runner|
      runner.main_file = 'input/my_file.tex'
      runner.options << '-src-specials'
    end
  </tt>
  Here the output file and the input file are in different directories. It also
  tells latex to put source specials in the generated DVI using the -src-specials
  command line option
===Tasks using custom actions
In the previous examples, only the default actions were used. This means that
multiple runs of latex were only done if either the labels or the index changed.
This is enough if the latex file doesn't use particular packages, which require
more than one run. One such package is feynmf (used to draw Feynman diagrams in
physics). When latex processes a file using this package, a file with extension
.mf is produced (the name is chosen by the user in the latex file). This file
needs to be processed using metafont, then latex needs to be run again. This
requires writing some custom actions (in particular, a pre-run action and a
post-run action). Here's the code of the pre-run one:
<tt>
  pre_action = lambda do |runner|
    unless runner[:old_diagrams_mf]
      contents = File.exist?('diagrams.mf') ? File.read('diagrams.mf') : nil
      runner[:old_diagrams_mf] = contents
    end
  end
</tt>
The lines inside the unless statement read the contents of the 'diagrams.mf' file
(the one produced by the feynmf package after the first run of latex) and store 
the contents of the file under the :old_diagrams_mf key in the runner object. If
the file doesn't exist, the :old_diagrams_mf entry will be nil. The unless statement
is there only to avoid reading the diagrams.mf file after the first run of latex:
as we'll see, the corresponding post-run action will change the :old_diagrams_mf
to contain the new contents of the file, after latex is first run. At this point,
in following runs, the contents of the file won't change, so reading it again would
only be a waste of time. <b>Note:</b> since the diagrams.mf file is generated in
the output directory, and since the current directory _is_ the output directory
when actions are called, there's no need to worry about paths.

Here's the code for the post-run action:
<tt>
  post_action = lambda do |runner|
    new_contents = File.exist?('diagrams.mf') ? File.read('diagrams.mf') : nil
    if new_contents == runner[:old_diagrams_mf] then 0
    else
      runner[:old_diagrams_mf] = new_contents
      sh "mf diagrams.mf"
      1
    end
  end
</tt>

Here's what's happening: first the diagrams.mf file is read again, and its
contents (or nil if it doesn't exist) are stored in <i>new_contents</i>; then,
the new contents of the file are compared with the old ones, stored in the 
<tt>:old_diagrams_mf</tt> entry of the runner object. If the contents haven't
changed the proc returns 0, meaning that, as far as it is concerned, there's no
need to run latex again. If the contents have changed, instead, the new contents
are stored, metafont is run (the name of the executable is mf) and 1 is returned.
This means that this action asks the runner to run latex again at least one time.

Finally, here's the call to <tt>latex_file</tt> to actually define the task:
<tt>
  latex_file('my_file.dvi') do |runner|
    runner.pre_run_actions << pre_action
    runner.post_run_actions << post_action
  end
</tt>

==Problems
* nested dependencies (an \input-ed file which contains another \input) aren't
  recognized
* a file which was \included (or \inputed) in the old version of an automatically
  generated tex file, which is not included in the new version but still exists
  and changes causes the output file to be rebuilt (only the first time)
* the process of finding dependencies isn't customizable
* the action which checks for changed references causes a second latex invocation
  every time the .aux file changes. I don't know enough of the contents of the
  .aux file to parse it, but I need to base the decision on it, because messages
  such as "rerun to get cross-references right" in the latex log file aren't
  reliable (for example, when the hyperref package is used, they aren't
  produced)
=end

require 'rake'
require 'strscan'
require 'digest/md5'
require 'delegate'
require 'set'

class String

  unless String.method_defined?(:end_with?)
    
=begin rdoc
  Returns true if the receiver ends with _str_ and false otherwise.
    
  ====Note
  This method is only defined if it doesn't already exist. If it does (for example
  because you're using ruby 1.8.7 or the facets library), the existing version is used.
=end
    def end_with? str
      self[(-str.size)..-1] == str
    end
  end

=begin rdoc
  Returns a copy of the string with its extension (that is, the part of the
  string after the last dot) replaced by <i>new_ext</i>
  ====Notes
  * if <i>new_ext</i> is empty, the extension of the file and the dot before it
    will be removed, thus giving a file with no extension
  * if the file doesn't have an extension, <i>new_ext</i> will be added to it
  * it's not necessary to include a leading dot in <i>new_ext</i>. If it isn't
    included, it will be added automatically (only if <i>new_ext</i> isn't empty)
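
  The rules above, written as a standalone function for illustration.
  <tt>swap_extension</tt> is a hypothetical name and this is not the method
  defined below; it assumes that the extension is whatever follows the last
  dot of the last path component.

```ruby
# Hypothetical standalone version of the rules listed above.
def swap_extension(name, new_ext)
  new_ext = '.' + new_ext unless new_ext.empty? || new_ext.start_with?('.')
  if name =~ /\.[^.\/]+\z/
    name.sub(/\.[^.\/]+\z/) { new_ext }   # replace the existing extension
  else
    name + new_ext                        # no extension: append the new one
  end
end

p swap_extension('my_file.dvi', 'tex')    # prints "my_file.tex"
p swap_extension('my_file.dvi', '.tex')   # prints "my_file.tex"
p swap_extension('out/my_file', 'idx')    # prints "out/my_file.idx"
p swap_extension('my_file.dvi', '')       # prints "my_file"
```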
=end
  def change_extension new_ext
    new_ext = '.'+new_ext unless new_ext[0,1] == '.' or new_ext.empty?
    res = dup
    # Match only the text after the last dot of the last path component
    res = res + new_ext unless res.sub!(/\.[^.\/]+\z/, new_ext)
    res
  end
  
end

=begin rdoc
  :call-seq: latex_file(name){|runner|...}

  Adds a task to build a latex file. _name_ is the name of the file to produce.

  If a block is given, it's called passing it the +LaTeXRunner+ which will be 
  used to build the file. See +LaTeXRunner+ for examples.
=end
def latex_file name
  name = name.to_s
  runner = LaTeXRunner.new
  runner.main_file = name.change_extension '.tex'
  yield runner if block_given?
  tsk = file( name => runner.main_file){runner.execute}
  runner.task = tsk
  tsk
end

=begin rdoc
Class which encapsulates all the information needed to build a latex task. It
contains both the parameters to run latex, many of which can be changed by the
user, and the values stored by the actions. Actions can store any value they
like, using this class as if it were a Hash. Indeed, using +DelegateClass+, 
instances of this class have exactly the same behavior as a Hash.
=end
class LaTeXRunner < DelegateClass(Hash)
  
=begin rdoc
  The default programs to use to process latex files, depending on the extension
  of the file to generate. To add a default program for another extension, simply
  add the corresponding entry here. To change the default program for a given 
  extension for all the latex tasks, change it here. To change the program used
  in a specific task, change it in the corresponding runner
=end
  PROGRAMS = {
    'dvi' => 'latex',
    'pdf' => 'pdflatex'
  }
  
=begin rdoc
  Default procs which are called before each run of latex, usually to store data
  relative to files which will help decide whether latex should be run a second
  time.
=end
  PRE_ACTIONS = {}
  
=begin rdoc
Default procs which are called after each run of latex, usually to decide whether
latex needs to be run again (and how many times). Currently, two of them are 
provided:
* one which checks if references have changed
* one which checks if indexes have changed
=end
  POST_ACTIONS = {}
    
  PRE_ACTIONS[:aux] = lambda do |runner|
    auxs = Set.new
    Dir.glob('*.aux'){ |f| auxs << runner.compute_digest( f)}
    runner[:old_auxs] = auxs
  end
  
  PRE_ACTIONS[:index] = lambda do |runner|
    index = File.basename(runner.task.name).change_extension('.idx')
    runner[:old_idx] = runner.compute_digest( index) unless runner.has_key?(:old_idx)
  end
  
  POST_ACTIONS[:aux] = lambda do |runner|
    auxs = Set.new
    Dir.glob('*.aux'){ |f| auxs << runner.compute_digest( f)}
    # Compare with the digests stored by the :aux pre-run action
    if auxs == runner[:old_auxs] then 0
    else
      runner[:old_auxs] = auxs
      1
    end
  end
    
  POST_ACTIONS[:index] = lambda do |runner|
    index = File.basename(runner.task.name).change_extension('.idx')
    new_dig = runner.compute_digest( index)
    if runner[:old_idx] == new_dig then 0
    else
      runner[:old_idx] = new_dig
      sh "makeindex #{index}"
      1
    end
  end
  
=begin rdoc
  The name of the file on which latex should be invoked. The input directory is
  determined from this.
=end
  attr_accessor :main_file
  
=begin rdoc
  The maximum number of times latex can be run. Set it to nil if you want to
  remove the limit (actually, the limit is not removed, but set to 1000, which
  should be high enough to be considered infinity).
=end
  attr_accessor :max_runs
=begin rdoc
  The name of the program to use to process the .tex files for this runner. A
  default value is computed looking at the extension of the output file, using
  the PROGRAMS hash. To change it only for a single task, set this attribute to
  the correct program yourself
=end
  attr_accessor :program
=begin rdoc
  The <tt>Rake::FileTask</tt> associated with the runner
=end
  attr_reader :task
  
# The directory where the input file is located. It is determined automatically
# based on the <tt>main_file</tt> attribute
  attr_reader :input_dir

# The full path of the input directory
  attr_reader :full_input_dir
  
# An array containing the actions to run before each run of latex. It defaults
# to the actions which check the contents of the .aux and .idx files
  attr_reader :pre_run_actions
  
# An array containing the actions to run after each run of latex. It defaults
# to the actions which check whether the references or the index have changed
  attr_reader :post_run_actions
  
# An array of the options to pass to latex. Defaults to <tt>['-interaction=batchmode']</tt>
  attr_reader :options
  
# The directory where the output file should be created. It is determined automatically
# from the name of the task
  attr_reader :output_dir
  
# The full path of the output directory
  attr_reader :full_output_dir

#Creates a new +LaTeXRunner+
  def initialize
    @hash = {}
    super @hash
    @task = nil
    @main_file = nil
    @input_dir = nil
    @full_input_dir = nil
    @output_dir = nil
    @full_output_dir = nil
    @pre_run_actions = [:aux, :index].map{|i| PRE_ACTIONS[i]}
    @post_run_actions = [:aux, :index].map{|i| POST_ACTIONS[i]}
    @max_runs = 5
    @cmd = nil
    @program = nil
    @options = ['-interaction=batchmode']
  end

#Sets the task associated with the runner to _tsk_
  def task= tsk 
    @task = tsk
    @output_dir = File.dirname(@task.name.to_s)
    @full_output_dir = File.expand_path(@output_dir)
    deps = find_included_files @main_file, true
    file(@task.name => deps)
  end
  
#Sets the main file (the one on which latex is invoked) to _file_. The
#<tt>input_dir</tt> and <tt>full_input_dir</tt> attributes are set accordingly
  def main_file= file
    @main_file = file
    @input_dir = File.dirname(@main_file)
    @full_input_dir = File.expand_path(@input_dir)
  end

=begin rdoc
  Starts the build process for the output file.
  
  It determines the latex command line, scans the main tex file for
  new dependencies and invokes their tasks, then calls <tt>run_latex_as_needed</tt>.
=end
  def execute
    @program ||= PROGRAMS[File.extname(@task.name)[1..-1]]
    @cmd = "#@program #{@options.join ' '} "\
        "-output-directory=#@full_output_dir #{File.basename(@main_file)}"
    new_deps = find_included_files( @main_file, false) - @task.prerequisites
    new_deps.each{|d| file(d).invoke}
    run_latex_as_needed
  end
  
# Computes an MD5 digest for the given file, reading the file line by line. Returns
# *nil* if the file doesn't exist  
  def compute_digest file
    return nil unless File.exist? file
    dig = Digest::MD5.new
    File.foreach(file){|l| dig << l}
    dig.hexdigest
  end
  
  
  private
  
=begin rdoc
  Invokes latex until the final output is produced, with a maximum given by
  the <tt>max_runs</tt> attribute.
=end
  def run_latex_as_needed
    self[:remaining_runs] = 1
    max_runs = @max_runs || 1000
    max_runs.times do 
      run_latex_once
      break if self[:remaining_runs] < 1
    end
  end
  
=begin rdoc
  Manages a single invocation of latex. This means:
  * calls all the specified <i>pre-run actions</i>
  * executes latex
  * calls all the specified <i>post-run actions</i>
  * adds the highest of the numbers returned by the <i>post-run actions</i> to
    the number of remaining runs
=end
  def run_latex_once
    self[:remaining_runs] -= 1
    Dir.chdir(@full_output_dir) do
      @pre_run_actions.each{|a| a.call self}
    end
    Dir.chdir(@input_dir) do
      puts @cmd
      self[:latex_output] = `#@cmd`
      raise RuntimeError, "There were #@program errors. "\
          "See #{@task.name.change_extension '.log'} for details" unless $?.success?
    end
    Dir.chdir(@full_output_dir) do
      self[:remaining_runs] += @post_run_actions.inject(0){|res, a| [res, a.call(self) || 0].max }
    end
  end
  
=begin rdoc
Finds the files given as arguments to the <tt>\includeonly</tt>, <tt>\include</tt>
and <tt>\input</tt> latex commands in file _file_ and returns an array of their 
names (relative to the directory where rake is executed). If <i>only_existing</i>
is true, names of nonexistent files won't be included.
=end
  def find_included_files file, only_existing
    return [] unless File.exist? file
    deps = []
    File.foreach(file) do |l|
      sc = StringScanner.new l
      until sc.eos?
        if sc.scan( /\\{2}/) or sc.scan(/\\%/) then next
        elsif sc.scan(/%/) then sc.terminate
        elsif sc.scan(/\\(?:include|input|includeonly)\{([^}]+)\}/) 
          new_dep = File.join(@input_dir, sc[1])
          new_dep += '.tex' unless new_dep.end_with? '.tex'
          deps << new_dep unless only_existing and !File.exist?(new_dep)
        else sc.pos += 1
        end
      end
    end
    deps
  end
  
end