qube99 Posted November 29, 2013

I am writing a pretty elaborate script, at least for me it's elaborate. It features about 5 levels of nested conditional loops, all based on random numbers of various types. I ran into one stage where a script was calling itself recursively, which FMP choked on. So what I did was to write each loop as a separate script, and this works great for me. I used a bunch of global variables plus local variables. The thing is now in a bunch of small, highly manageable chunks. This also makes the scripts a lot easier to understand, since the whole thing was way too much to wrap my head around.

I see large complex scripts all the time, so I began to wonder whether what I did is a best practice. Now that I have it all working, should I rewrite it into one monster script with a custom function for the recursion? Or is it okay to just have a bunch of small scripts (and no custom functions) all calling each other in loops while sharing global variables?

I found it easier to code the innermost loop first and then work my way outward one loop at a time. When I tried to start with the outermost loop, I simply was unable to envision the thing in detail even though I knew exactly what I wanted to do. Any single loop was easy enough to see, but each one needed its inner loop before it could be coded and tested. There was just no way to write the outer loop until the inner loops were done. I suppose this is the same way that calculations are evaluated: innermost expressions first, working outward.
jbante Posted December 3, 2013

It's perfectly fine, even preferable, to keep your process split out into separate scripts. There are plenty of arguments for this from different sources, but I personally think that the best reason is that human working memory is one of the biggest bottlenecks in our ability to write software. It's dangerous to put more functionality in one script (or calculation, including custom functions) than you can keep track of in your head. If you find yourself writing several comments that outline what different sections of a script do to help you navigate it, consider splitting those sections into separate sub-scripts, using the outline comments for the sub-script names. If your parent scripts start to read less like computer code and more like plain English because you're encapsulating functionality into well-named sub-scripts, you're doing something right. (The sub-scripts don't have to be generalized or reusable for this to be a useful practice. Those are good qualities for scripts, but do the encapsulation first.)

I think you'd be better off if the scripts shared information with script parameters and results instead of using global variables, if you can help it. When you set a global variable, you have to understand what all your other scripts will do with it; and when you use the value in a global variable, you have to understand all the other scripts that may have set it. This has the potential to run up against the human working memory bottleneck very fast. With script parameters and results, you only ever have to think about what two scripts do with any given piece of information being shared: the sub-script and its parent script. Global variables can also lead to unintended consequences if the developer is sloppy. If one script doesn't clear a variable after it's no longer needed, another script might do the wrong thing based on the value remaining in that global variable. With script parameters, results, and local variables, the domain of possible consequences of a programming mistake is contained to one script. I might call it a meta best practice to use practices that limit the consequences of developer error. Globals are necessary for some things; just avoid globals if there are viable alternative approaches.

There is a portability argument for packing more functionality into fewer code units: fewer big scripts might be easier to copy from one solution into another than more small scripts. Big scripts are also easier to copy correctly, since there are fewer dependencies to worry about than with several interdependent smaller scripts that achieve the same functionality. For scripts, this is often easy to solve with organization, such as by putting any interdependent scripts in the same folder with each other. Custom functions don't have folders, though. For custom functions, another argument for putting more in fewer functions is that one complicated single function can be made tail recursive, and can therefore handle more recursive calls than if any helper sub-functions are called. This ValueSort function might be much easier to work with if it called separate helper functions, but I decided that for this particular function, performance is more important.
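For example, here's a minimal sketch of the parameter/result pattern (the script and variable names are just placeholders, not anything from your actual solution):

    Parent script:
        Perform Script [ "Process One Node" ; Parameter: $nodeID ]
        Set Variable [ $isDuplicate ; Value: Get ( ScriptResult ) ]

    Sub-script "Process One Node":
        # Pull the parameter into a local variable once, at the top
        Set Variable [ $nodeID ; Value: Get ( ScriptParameter ) ]
        # ... do the work for this one node, setting $isDuplicate along the way ...
        # Hand the answer back to the calling script
        Exit Script [ Result: $isDuplicate ]

Only those two scripts ever have to agree on what $nodeID and the result mean, and nothing outside them can quietly change either value.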
qube99 Posted December 4, 2013

Looks like you've given this a great deal of thought. I had run into the limits of human thought but never stated it so clearly. A major part of solving a problem is to state it clearly, which you have done. I don't know how long the scripts in my current project would have taken to write if I had not broken the thing into subscripts. I tried, but it was mentally overwhelming. I was forced to break the problem into smaller pieces and deal with each in turn. I got it working properly and am now cleaning up the scripts. It gets a little messy when you do countless script revisions over many days.

I have a current project that uses a number of global variables. My thinking is that a variable that will be used millions (even billions) of times in a calculation is best kept available in memory rather than trying to scope it and have it created and calculated over and over again. I also wanted to avoid all the disk I/O calls possible, so I used global variables instead of fields. This approach required me to spend a long afternoon tracing every single variable to make sure it was properly dealt with. I wanted to ensure that I wasn't wasting resources with orphaned variables, which can occur in a heavily revised project. There's nothing quite like tracking down a memory leak in a delivered project; much better to deal with it while in development. Same goes for fields where data is appended.

One thing I did to help was to comment most of the variables, in addition to giving them descriptive names, noting which subscript would use each one and which script it came from. Very helpful for low-storage minds like mine. Yeah, it took time to do that, but I will probably return to this project in a few years, so I need to use the system to remember things for me.

The subscripts were useful in another way: they could be toggled on and off from the master script. This allowed me to keep multiple variations of a subscript that I could quickly swap. Funny how a script that works fine will suddenly go insane when another script operating on the same data is added. After I was satisfied, the extras were simply deleted from the script directory.

I may redo the calculation part of this project in C++, but I must say that employing FMP made it all dramatically easier to develop. I would go so far as to call FMP a rapid development environment for this sort of work. I used FMP's excellent display functions to show me what was happening. Impossible to do that in an ANSI/ISO console application without coding some sort of interface that poorly duplicates what FMP already does so well.
jbante Posted December 4, 2013

I have a current project that uses a number of global variables. My thinking is that a variable that will be used millions (even billions) of times in a calculation is best kept available in memory rather than trying to scope it and have it created and calculated over and over again. I also wanted to avoid all the disk I/O calls possible, so I used global variables instead of fields. This approach required me to spend a long afternoon tracing every single variable to make sure it was properly dealt with. I wanted to ensure that I wasn't wasting resources with orphaned variables, which can occur in a heavily revised project. There's nothing quite like tracking down a memory leak in a delivered project; much better to deal with it while in development. Same goes for fields where data is appended.

Beware premature optimization. I have never, ever seen a FileMaker solution where the processing overhead of converting script parameters and results into local variables has been the performance bottleneck relative to the rest of the scripted process. You're right to point out that there's a balance to be struck between what makes the computer faster and what makes the developer faster, but I don't think script parameters vs. global variables is a particularly fruitful nit to pick for making the computer faster.
qube99 Posted December 5, 2013

I wasn't speaking generally but about my specific project. I looked at all the global variables in it (for the umpteenth time) and I only see one place where a script parameter could be used. In all other circumstances I cannot imagine a parameter being used without constructing some Rube Goldberg method of parsing them, say passing 25 variables used in multiple subscripts in a single parameter. This specific application is pretty unique and not applicable to normal business logic.

What's of utmost importance at this stage of development is speed. Some of the calculations I will attempt with this system are truly immense; every millisecond counts a great deal. One problem I dream of solving some day involves examining 6.4 quadrillion nodes to gather statistics on the 21 trillion unique states contained in them. That'll probably require a C++ application (or a quantum computer), as I doubt that FMP can ever run fast enough. But I do want to use FMP for the largest problems it can reasonably solve. What's reasonable? Good question. Joseph Becker ran a similar C++ enumerator for 7 months non-stop to find the answer to this combinatoric question: http://smallpuzzlecollection.blogspot.com/2011/12/one-in-trillion.html And risky: there was no guarantee that Becker would find any solutions at all!

I am now working mostly on user features (grunt work) and starting to think about optimization. I certainly don't know enough about FMP and am just learning. One thing I could sure use is a way of measuring time in milliseconds, to avoid conducting tests that take too long. For instance, can a number be passed faster as a variable or as a parameter? How about passing text? Which is faster: Set Field, Insert Calculated Result, or Replace Field Contents? Can a conditional run faster as a text string, or would it run faster as Unicode? Same for scripted finds: what's the fastest way?

Most optimization will be achieved in the logic. For instance, I don't want to waste time creating duplicates, so I run a test. That gave a big boost. But there are a LOT of duplicates, and it's best if they never get processed at all, only to be eliminated later by the duplicate-test script. For instance, I noticed that when a parent spawns an anti-child it creates a duplicate: the child is its own grandparent. So I wrote a filter to not even look at an anti-child. That gave another big speed boost. I also see that spawning more than S/2 identical members in a row creates a duplicate of an anti-member. Visualize it: turning a dial past halfway is the same as turning the dial the other way by a lesser amount. I haven't implemented this filter yet, but it's quite certain to offer another big speed boost.

Can I ever eliminate all sources of duplicates so that the scripted find can also be eliminated? Probably not. As the levels increase, the combinatorics of duplicates also increase. Maybe there's a predictable expanding series that I can recognize. I will try to solve the filters one by one and see. If I can prefilter all duplicates, then there's no need to create records at all; they are only there so that duplicates can be found. It's the statistics that really count, not individual records. That'd be a pretty big speed boost.

In order to determine the best script practices for my particular project, I'd like to learn:
How to measure elapsed time in milliseconds, and
How to show progress while the main window is frozen

Once I can measure time in milliseconds I can test functions and data types (without running tens of thousands of iterations). The results might be applicable across a broad spectrum of projects. I'll document what I find in these tests. I realize that saving a few milliseconds here and there is utterly meaningless to normal FMP projects. For me it will mean saving days and weeks of computing time.
jbante Posted December 5, 2013

In order to determine the best script practices for my particular project, I'd like to learn: how to measure elapsed time in milliseconds, and how to show progress while the main window is frozen. Once I can measure time in milliseconds I can test functions and data types (without running tens of thousands of iterations). The results might be applicable across a broad spectrum of projects. I'll document what I find in these tests. I realize that saving a few milliseconds here and there is utterly meaningless to normal FMP projects. For me it will mean saving days and weeks of computing time.

To measure time in milliseconds, use the new Get ( CurrentTimeUTCMilliseconds ) function in FileMaker 13 (Get ( UTCmSecs ) in 12). To show progress for this sort of thing in the past, I'd set a timer variable to compare against, and refresh and re-freeze the window periodically to show progress.

Saving a few milliseconds here and there is absolutely meaningful to normal FileMaker projects! It can be easier to remember the faster ways to do things when it counts if we make it a habit to do things the faster way when it counts less. Also, any well-designed script (or calculation) could wind up being a sub-script of another script, and another, and another, possibly eventually finding itself in an inner loop, and then performance becomes a big deal very fast. Further, milliseconds on the desktop can become seconds in WAN deployments, and seconds of delay mean that the computer is the bottleneck rather than the human user, which is a situation to avoid whenever possible. However, I'm disinclined to spend my time improving the speed of something that takes tens of milliseconds until I know I've already optimized the bejeezus out of everything that takes hundreds of milliseconds. When I am really out of low-hanging fruit, by all means I'll try to get tens of milliseconds down to ones of milliseconds. Ask me about my experience writing UUID functions sometime.

Passing 25 variables in a single parameter is a lot, but not at all unheard of. The "Rube Goldberg contraption" for creating and parsing multiple parameters to a script is a problem that other developers have already worked on, and the solutions aren't that complicated at all. Check out Let notation and the #Parameters module. The module is geared towards feature richness more than speed, so you may get faster performance if you build Let notation yourself rather than using the custom functions. Also, if you still have to pass 25 parameters to a sub-script, it may be worth considering splitting the sub-script up even further. It may wind up not making sense to split the scripts any further for your particular application, but I usually take more than 3 parameters as a cue to start considering alternative designs; that's considering, not necessarily finding and adopting.
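To make the timing part concrete, a rough sketch (the variable names are just examples):

    Set Variable [ $start ; Value: Get ( CurrentTimeUTCMilliseconds ) ]
    # ... the steps being measured ...
    Set Variable [ $elapsedMs ; Value: Get ( CurrentTimeUTCMilliseconds ) - $start ]

And the Let notation idea, stripped down to two made-up values: the parent script's parameter calculation builds a block of assignments,

    "$boardSize = " & Quote ( $boardSize ) & " ; $maxDepth = " & Quote ( $maxDepth )

and the first step of the sub-script turns those assignments back into local variables:

    Set Variable [ $ignore ; Value:
        Evaluate ( "Let ( [ " & Get ( ScriptParameter ) & " ; x = \"\" ] ; \"\" )" ) ]

The #Parameters module wraps the same idea in custom functions with more error checking.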
David Jondreau Posted December 5, 2013

I think you're barking up the wrong tree using FileMaker... maybe just barking! That said, I'm not sure what the specifics are, but if you're looking to avoid duplicate records, you can prevent them on creation by using validation. That's probably faster than searching and deleting them later.

I show progress by using a global variable on a layout that refreshes every x passes of a loop or stage in a script.
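Roughly like this inside the loop, for example (the counter and threshold are made up):

    # $$progress appears on the layout as a merge variable: <<$$progress>>
    Set Variable [ $$progress ; Value: $i & " of " & $total ]
    If [ Mod ( $i ; 1000 ) = 0 ]
        Refresh Window []
        Freeze Window
    End If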
qube99 Posted December 6, 2013

I was unable to figure out how to use validation in my scenario. I need a boolean result for branching logic, and I was unable to get validation to return anything other than 0 for Get ( LastError ). My script goes haywire with it, so I need a way to control the thing. Maybe someone has a method they can share.

FMP is really not going to succeed at anything other than the smallest problems I'm working on. It has proved invaluable as a rapid development environment, though. I'll recode the thing in C++ once I get all the parts figured out. I have done a bit of script method efficiency testing, so I'll start a new thread for that.
dansmith65 Posted December 6, 2013

Honza from 24U is big on FileMaker optimization; you can read his blog or buy his product.

I rarely use global variables, and I almost always pass data as parameters/results to child/parent scripts. In spite of that, however, I did think it best to use global variables in my JSON module, which sounds similar to the setup Steve mentioned he was using. I don't know Steve's specific situation, but perhaps it's an edge case where using global variables is easier/clearer/faster?

Regarding saving data to a field, the Set Field script step is generally preferable to any of the "Insert *" steps, unless you specifically want to interact with a layout object.

Regarding performance of finds, make sure the field you are performing a find on can be indexed; performing finds on fields containing an unstored calculation will typically be slow. When performing a scripted find, you may be able to get slightly faster performance by constructing the find request in the Perform Find step, using variable names in the find criteria. This method is not as clear in the code as using Enter Find Mode, Set Field, Perform Find, but it may be faster. Also, beware of empty variables, as an empty variable will result in a find on the name of the variable.

It's perfectly fine, even preferable, to keep your process split out into separate scripts. There are plenty of arguments for this from different sources, but I personally think that the best reason is that human working memory is one of the biggest bottlenecks in our ability to write software. It's dangerous to put more functionality in one script (or calculation, including custom functions) than you can keep track of in your head.

That's a great explanation! Like Steve said, I've experienced being overwhelmed by a single long and complex script, but I didn't know how to explain why I felt that way.

One more thing... I don't think custom functions are inherently faster than scripts at performing recursion. But by moving your loops to custom functions, you do introduce a limit on the number of iterations that can be performed (either 5,000 or 10,000 depending on how the function is written, IIRC).
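Going back to the scripted find for a second, here's a bare-bones sketch of what I mean (the table, field, and variable names are made up):

    Set Error Capture [ On ]
    If [ not IsEmpty ( $stateCode ) ]
        # The stored find request contains:  Node::StateCode  ==$stateCode
        Perform Find [ Restore ]
        Set Variable [ $isDuplicate ; Value: Get ( FoundCount ) > 0 ]
    Else
        # Handle the empty value explicitly; otherwise the find would search
        # for the literal text "$stateCode"
        Set Variable [ $isDuplicate ; Value: 0 ]
    End If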
jbante Posted December 7, 2013

One more thing... I don't think custom functions are inherently faster than scripts at performing recursion. But by moving your loops to custom functions, you do introduce a limit on the number of iterations that can be performed (either 5,000 or 10,000 depending on how the function is written, IIRC).

10,000 or 50,000, depending on whether the custom function is tail recursive or not.
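For anyone curious what the difference looks like, here's a sketch using a made-up pair of custom functions that sum a return-delimited list of numbers:

    // SumValues ( valueList ): not tail recursive. The addition still has to happen
    // after the recursive call returns, so it gets the smaller call limit.
    If ( IsEmpty ( valueList ) ; 0 ;
        GetValue ( valueList ; 1 )
            + SumValues ( RightValues ( valueList ; ValueCount ( valueList ) - 1 ) )
    )

    // SumValuesTail ( valueList ; runningTotal ): tail recursive. The recursive call is
    // the last thing evaluated, so FileMaker allows roughly five times as many calls.
    If ( IsEmpty ( valueList ) ; runningTotal ;
        SumValuesTail (
            RightValues ( valueList ; ValueCount ( valueList ) - 1 ) ;
            runningTotal + GetValue ( valueList ; 1 )
        )
    )

In practice you'd call the tail version with a starting total of 0, e.g. SumValuesTail ( $list ; 0 ).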