Batch-like interaction via pipes

<Joachim 21/10/2003>

NOTE: Everything described on this page is pure Tcl and works equally well from tclsh.


Batch-like interaction with a unix process here means the situation where some external process is running (a sort of server, if you like) to which you can send commands for evaluation and get the result back as a return value.

Of course every interaction is more or less like this. The concept is meant in contrast to shell-like interaction: there is no window that receives the results as they arrive asynchronously. Instead we wait until the result is complete and get it in a pure stripped-down form as the value of a variable. Hence it can be used in scripts as a sort of extension.


Maxima

As an example, the computer algebra system MAXIMA is considered (uppercased to distinguish from the Tcl command maxima introduced below). The MAXIMA project (one of the oldest in symbolic computation, and a primary source of inspiration for Maple) is hosted at SourceForge. The best way to install it under OSX is probably with the help of Fink http://fink.sourceforge.net. Note that you currently need to grant access to the unstable tree...

The simplest form of batch-like interaction is exec: you spawn some process with a one-liner, get back the result, and then the process dies again. Many math programmes allow some sort of one-line invocation like this. For example you can call Maxima (from a unix shell) with a command like

 
    $ maxima --batch-string="diff(sin(x),x);"

This returns a dozen lines of welcome header, followed by the result you are interested in, "COS(x)". From within Tcl it is not that difficult to wrap this in an exec and filter the output to get a little procedure that can evaluate MAXIMA expressions. (This is the proc maxima::maximaBatch below.)

However the main emphasis of this page is on the slightly different situation where instead of having such short-lived processes we want to have a process running on a more permanent basis, and send instructions to it and get back the results. There are at least two reasons to want this: one is that if you have many calculations to do it is more economical to keep a single MAXIMA session running than to spawn a new session for each command. The second reason is that in this way the state of the external process is preserved between the calls, and you can build up a complicated computation step by step.
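
The second point, that state survives between calls, can be tried out even without MAXIMA installed. The following is only a minimal sketch under stated assumptions: a child tclsh (located with [info nameofexecutable]) stands in for the external program, and the proc name ask is made up for the illustration.

```tcl
# Sketch only: a child tclsh stands in for the external program, so this
# runs without MAXIMA.  The proc name [ask] is hypothetical.
set pipe [open "|[list [info nameofexecutable]]" r+]
fconfigure $pipe -buffering line

# Send one script line to the child and read back one line of output.
proc ask { pipe script } {
    puts $pipe $script
    return [gets $pipe]
}

# The first call creates state in the child; the second one relies on it.
set first  [ask $pipe {set x 21; puts stored}]
set second [ask $pipe {puts [expr {$x * 2}]}]

close $pipe
```

The variable x set by the first call is still known to the child when the second call arrives. This works here only because each request produces exactly one line of output; the code further down has to deal with replies of unknown length.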

Design

In the example below, a procedure maxima is defined in Tcl: it takes a list of arguments and sends them to a running MAXIMA process for evaluation in the current state of the system. Hence you can do for example

 
    % maxima f : (x+y)^4
    % maxima g : (x-2*y)^2
    % maxima diff( f+g, x )
    % maxima expand( % )

and the final return value will be the string

 
    4*y^3+12*x*y^2+12*x^2*y-4*y+4*x^3+2*x

Sometimes, if you are sending such commands from a Tcl script, you can also get away with keeping the record of the state of the variables in Tcl, and hence make every command a one-liner that does not depend on the internal state of MAXIMA. In the simple example above it would be just as easy to do:

 
    % set f "(x+y)^4"
    % set g "(x-2*y)^2"
    % set res [maxima diff( $f+$g, x )]
    % maxima expand( $res )

Here comes a simple maxima batch interaction implementation. The code can also be downloaded from http://www.cirget.uqam.ca/~kock/alpha-tcl/maxima.tcl . At some point this should all evolve into a maxima mode for AlphaTcl. (The file also contains support for a variant that returns in LaTeX format. This is used with the notion of Pipes.Worksheets to have active MAXIMA cells in a LaTeX document.)

 
  # This is a batch-like interface to MAXIMA. The command [maxima] (which 
  # is just synonym for [maxima::maxima]) will send all its arguments to
  # evaluation by MAXIMA and return the result.  When called isolated, 
  # MAXIMA will be spawned and then quit immediately after returning the 
  # result.  Alternatively, a MAXIMA session can be started with the 
  # command [maxima::start].  In this case subsequent calls to [maxima] 
  # will be evaluated in this session (until the session is killed with 
  # [maxima::stop]).  The advantages of this approach are, first, that it
  # is more economical to call an existing process than to spawn a new one
  # for each call, and second, that the state of MAXIMA is preserved, so that
  # successive calls can refer to MAXIMA variables (either user-defined 
  # variables or labels like D5).  (There is also a variant proc [maximaTex]
  # which returns the result in TeX, which is useful in live math TeX 
  # documents (worksheets).)
  #
  # Examples:
  # 
  # % source maxima.tcl
  # % maxima factor( 23423412342131234 )
  # 2*131*173491*515313977
  # % set f [maxima expand( (x+y+z)^4 )]
  # z^4+4*y*z^3+4*x*z^3+6*y^2*z^2+12*x*y*z^2+6*x^2*z^2+4*y^3*z+12*x*y^2*z
  # 	+12*x^2*y*z+4*x^3*z+y^4+4*x*y^3+6*x^2*y^2+4*x^3*y+x^4
  # % maxima factor( $f )
  # (z+y+x)^4
  # 
  # Installation: first of all you need to have maxima installed on your
  # computer.  The easiest way is probably using fink.  Now write this 
  # line to your prefs.tcl file:
  # 
  #    source /path/to/thefile/maxima.tcl
  #    
  #    





  namespace eval maxima {}

  # The array data contains:
  #     transcript -- a log of everything done (like a shell)
  #     res -- last result
  #     label -- label of the last result (e.g. (D5))
  #     size -- the offset (in the transcript) of the previous feedback,
  #             including the prompt (e.g. just after (C5))

  proc maxima::start { } {
      # Close any old pipe:
      stop
      # Initialise the array:
      variable data
      set data(size) 0
      # Open the pipe:
      set data(pipe) [open "|maxima" RDWR]
      fconfigure $data(pipe) -buffering line -blocking 0
      # Set up a handler for the output:
      fileevent $data(pipe) readable ::maxima::receive
      # Wait until the receiver has finished with the MAXIMA welcome header:
      vwait ::maxima::data(res)
      # Then send a configuration instruction:
      maximaRunningBatch {DISPLAY2D:FALSE$}
      return
  }

  proc maxima::stop { } {
      variable data
      # Close the pipe:
      catch { close $data(pipe) }
      # Reset the data array:
      unset -nocomplain data
  }

  proc maxima::isRunning { } {
      variable data
      if { [info exists data(pipe)] } {
	  return [expr { [lsearch -exact [file channels] $data(pipe)] != -1 }]
      } else { 
	  return 0 
      }
  }



  # Event handler for data(pipe).  Writes the result to the variable 
  # data(res), from where the proc [maxima] reads it immediately.
  proc maxima::receive { } {
      variable data
      if { [eof $data(pipe)] } {
	  # There is nothing more to read --- just stop:
	  stop
	  return
      }
      # Just read as much as possible, and append it:
      append data(transcript) [read $data(pipe)]
      # When we read a prompt, stop reading:
      if { [regexp -start $data(size) -- {\(C\d+\)\s*$} $data(transcript)] } {
	  # We have found a new prompt.  Hence the result should be just before that:
	  if { [regexp -start $data(size) -- {.*\((D\d+)\)(.*)\(C\d+\)\s*$} $data(transcript) "" data(label) tmpRes] } {
	      set data(res) [string trim $tmpRes]
	  } else {
	      # We should only come in here at startup, then we need to set data(res)
	      # since the caller is waiting for this variable to be set:
	      set data(res) ""
	  }
	  set data(size) [string length $data(transcript)]
      }
  }

  # Requires trailing semicolon
  proc maxima::maximaRunningBatch { cmd } {
      variable data
      # First update the transcript:
      append data(transcript) $cmd
      # Set up a handler for the output:
      fileevent $data(pipe) readable ::maxima::receive

      # Then send the command:
      puts $data(pipe) $cmd
      # Timeout mechanism: We are going to wait for the variable res.  
      # Make sure it is written at least after some time:
      set timeout [after 10000 {set ::maxima::data(res) "TIMEOUT"}]
      vwait ::maxima::data(res)
      # If we have come so far there is no more need for the time bomb:
      after cancel $timeout
      if { [string equal $data(res) "TIMEOUT"] } {
	  stop
	  error "TIMEOUT"
      }
      return $data(res)
  }


  # This proc doesn't use pipes or file events or anything.
  # 
  # Requires trailing semicolon
  proc maxima::maximaBatch { cmd } {
      set preCmd {DISPLAY2D:FALSE$ }
      set transcript [exec maxima --batch-string=${preCmd}${cmd}]
      if { [regexp -- {.*\n\(D\d+\) (.*)\nBye.$} $transcript "" res] } {
	  return $res
      } else {
	  # No result label found: report what MAXIMA printed after the
	  # echoed input, or the whole transcript if that cannot be located.
	  if { ![regexp -- {.*\n\(C\d+\)[^\n]*\n(.*)\nBye.$} $transcript "" err] } {
	      set err $transcript
	  }
	  error $err
      }
  }

  proc maxima::maxima { args } {
      variable data
      # Build the command:
      set cmd [join $args " "]
      set cmd [string trim $cmd]
      if { ![string equal [string index $cmd end] {;}] } {
	  append cmd {;}
      }
      # See if we have maxima running already:
      if { [isRunning] } {
	  return [maximaRunningBatch $cmd]
      } else {
	  return [maximaBatch $cmd]
      }
  }




  # ====================================================================
  # Here comes the part with TeX formatted output


  # This proc doesn't use pipes or file events or anything.
  # 
  # Requires trailing semicolon
  proc maxima::maximaTexBatch { cmd } {
      set preCmd {DISPLAY2D:FALSE$ }
      set transcript [exec maxima --batch-string=${preCmd}${cmd}tex(%)\;]
      if { [regexp -- {.*\n\(D\d+\).*\$(\$.*\$)\$\n\(D\d+\).*\nBye.$} $transcript "" res] } {
	  return $res
      } else {
	  if { [regexp -- {.*DISPLAY2D : FALSE\n(.*)\n\(C\d+\)} $transcript "" err] } {
	      error $err
	  }
	  error $transcript
      }
  }


  proc maxima::receiveTex { } {
      variable data
      if { [eof $data(pipe)] } {
	  # There is nothing more to read --- just stop:
	  stop
	  return
      }
      # Just read as much as possible, and append it:
      append data(transcript) [read $data(pipe)]
      # When we read a prompt, stop reading:
      if { [regexp -start $data(size) -- {\(C\d+\)\s*$} $data(transcript)] } {
	  # We have found a new prompt.  Hence the result should be just before that:
	  if { [regexp -start $data(size) -- {.*\$(\$.*\$)\$\n\(D\d+\) FALSE\n\(C\d+\)\s*$} $data(transcript) "" tmpRes] } {
	      set data(res) [string trim $tmpRes]
	  } else {
	      # We should only come in here at startup, then we need to set data(res)
	      # since the caller is waiting for this variable to be set:
	      set data(res) ""
	  }
	  set data(size) [string length $data(transcript)]
      }
  }



  # Requires trailing semicolon
  proc maxima::maximaTexRunningBatch { cmd } {
      variable data
      # First we just send the command to usual evaluation:
      maximaRunningBatch $cmd

      # Now send the TeX formatting command

      # Set up a handler for the output:
      fileevent $data(pipe) readable ::maxima::receiveTex
      # Send the TeX formatting command
      puts $data(pipe) tex(%)\;

      # Timeout mechanism: We are going to wait for the variable res.  
      # Make sure it is written at least after some time:
      set timeout [after 10000 {set ::maxima::data(res) "TIMEOUT"}]
      vwait ::maxima::data(res)
      # If we have come so far there is no more need for the time bomb:
      after cancel $timeout
      if { [string equal $data(res) "TIMEOUT"] } {
	  stop
	  error "TIMEOUT"
      }
      return $data(res)
  }

  proc maxima::maximaTex { args } {
      variable data
      # Build the command:
      set cmd [join $args " "]
      set cmd [string trim $cmd]
      if { ![string equal [string index $cmd end] {;}] } {
	  append cmd {;}
      }
      # See if we have maxima running already:
      if { [isRunning] } {
	  return [maximaTexRunningBatch $cmd]
      } else {
	  return [maximaTexBatch $cmd]
      }
  }



  namespace eval maxima {
      namespace export maxima maximaTex
  }

  namespace import -force maxima::maxima maxima::maximaTex

Some implementation issues

A general problem to cope with is to determine when the output is complete. The output might consist of several lines, and it might take a very long time before the last line of output arrives if it is a complicated calculation. The only way seems to be to look out for the prompt...
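
Looking out for the prompt boils down to testing, each time a new chunk has arrived, whether the text accumulated since the last answer ends with an input label. Here is a small sketch of just that test, separated from any pipe handling. The proc name outputComplete is hypothetical, and the sample chunks are made up in the (C...)/(D...) label style that the regular expressions on this page expect.

```tcl
# Sketch only: [outputComplete] is a hypothetical helper, and the sample
# chunks are made up in the (C<n>)/(D<n>) label style used on this page.

# Return 1 if the text of $buffer from index $offset onwards ends with an
# input prompt such as "(C7) ", and 0 otherwise.
proc outputComplete { buffer offset } {
    return [regexp -start $offset -- {\(C\d+\)\s*$} $buffer]
}

set buffer ""
set offset 0
set seen [list]
# Pretend that the output arrives in three separate chunks:
foreach chunk [list "(D6) x^2" "+2*x+1\n" "(C7) "] {
    append buffer $chunk
    lappend seen [outputComplete $buffer $offset]
}
# $seen is now {0 0 1}: only the last chunk completes the output.
```

Anchoring the pattern at the end of the text matters: a label that shows up in the middle of the output does not count as a prompt.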

Second, since the proc that sends the command to the external process cannot also be the event handler, it is tricky for the calling proc to get the result back from the event handler and return it to the user. One solution which seems to work is to let the event handler write the result to a global variable once it is sure that the output is complete; the calling proc waits (using vwait) for this variable to be written, and when this happens its value is returned to the user. This is an awkward way of making the inherently asynchronous file event situation look like something synchronous which can be used in a usual linear script. Another possibility, perhaps more direct, is to let the calling proc have its own little ad hoc waiting scheme instead of relying on a file event: it could simply go into a little loop that every fraction of a second reads what is ready in the pipe, and continue like that until the prompt appears.
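
The second possibility, the ad hoc waiting loop, might look roughly like this. It is a sketch under assumptions, not the script used on this page: the proc name pollFor and its parameters are made up, and a child tclsh that prints a fake prompt stands in for the external program.

```tcl
# Sketch only: [pollFor] is a hypothetical helper, and a child tclsh that
# prints a fake prompt stands in for the external program.

# Read from the non-blocking channel $chan every $stepMs milliseconds until
# the accumulated text matches $promptRE; give up after $timeoutMs.
proc pollFor { chan promptRE {timeoutMs 5000} {stepMs 50} } {
    set buffer ""
    set deadline [expr { [clock clicks -milliseconds] + $timeoutMs }]
    while { ![regexp -- $promptRE $buffer] } {
        if { [eof $chan] } {
            error "pipe closed before a prompt arrived"
        }
        if { [clock clicks -milliseconds] > $deadline } {
            error "TIMEOUT"
        }
        # Sleep a little, then take whatever has arrived in the meantime:
        after $stepMs
        append buffer [read $chan]
    }
    return $buffer
}

set pipe [open "|[list [info nameofexecutable]]" r+]
fconfigure $pipe -buffering line -blocking 0
# Ask the child to print a result followed by a prompt without newline:
puts $pipe {puts -nonewline "42\n(C2) "; flush stdout}
set answer [pollFor $pipe {\(C\d+\)\s*$}]
# Switch back to blocking mode so that close waits for the child:
fconfigure $pipe -blocking 1
close $pipe
```

No fileevent and no vwait are involved, so the proc can return the text directly. The price is that nothing else happens in the application while the loop sleeps, since after with only a time argument does not enter the event loop.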

Emulating prompts

Further complications arise in the cases where the prompt is lost. For example it seems that Maple only returns complete lines, so the prompt will only be sent back to the Tcl interface as part of the next command. To handle such a situation one can envisage wrapping every command in a dummy command that marks the end of output. For example in Maple one could adopt the internal convention always to send a command foo as foo; ENDOFOUTPUT;. Maple will then first process the command foo (the extent of whose output we have no control over per se), and afterwards process the command ENDOFOUTPUT, which is not a known Maple command and will therefore just be repeated verbatim in the output. The callback proc will then patiently wait for the string ENDOFOUTPUT to appear in the stream, and when it does it will be filtered away, and the remaining text (the output from foo) will be written to the global variable res where the original proc awaits it.
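
The marker convention can be sketched as follows. Again this is only an illustration under assumptions: the proc name askMarked is made up, and a child tclsh stands in for Maple. Unlike Maple, the child will not echo an unknown word by itself, so the sketch sends an explicit command that prints the marker.

```tcl
# Sketch only: [askMarked] is a hypothetical helper, and a child tclsh
# stands in for a program that only sends back complete lines.

# Send $script followed by a command that prints the marker, then collect
# all output lines up to (but not including) the marker line.
proc askMarked { pipe script {marker ENDOFOUTPUT} } {
    puts $pipe $script
    puts $pipe [list puts $marker]
    set lines [list]
    while { [gets $pipe line] >= 0 } {
        if { $line eq $marker } {
            return [join $lines "\n"]
        }
        lappend lines $line
    }
    error "pipe closed before the marker arrived"
}

set pipe [open "|[list [info nameofexecutable]]" r+]
fconfigure $pipe -buffering line
set out [askMarked $pipe {puts one; puts two}]
close $pipe
```

Since the marker always arrives as a complete line of its own, plain blocking gets is enough here and no prompt detection is needed. The scheme breaks down, of course, if the real output can contain a line equal to the marker.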

Implementation without fileevent

In a sense it is a little backwards to use an event loop to handle the output since we have to wait until the output is complete before we really handle it. Here is an attempt to implement maxima without fileevent, setting up instead its own little personal event loop. I don't know if there are any advantages to this... This has not been given further attention, and the script (downloadable from http://www.cirget.uaqm.ca/~kock/alpha-tcl/alt-maxima.tcl ) is not up-to-date compared to the mainstream maxima script above.

Page last modified on April 19, 2006, at 11:51 PM