{"@type": "dcat:Dataset", "accessLevel": "public", "accrualPeriodicity": "irregular", "bureauCode": ["026:00"], "contactPoint": {"@type": "vcard:Contact", "fn": "John Stutz", "hasEmail": "mailto:john.c.stutz@nasa.gov"}, "description": "Matlab has a reputation for running slowly.  Here are some pointers on how to speed computations, to an often unexpected degree.  Subjects currently covered:\r\n\r\n \r\n\r\nMatrix Coding\r\n\r\nImplicit Multithreading on a Multicore Machine\r\n\r\nSparse Matrices\r\n\r\nSub-Block Computation to Avoid Memory Overflow\r\n\r\n --------------------------------------------------------------------------------------------------------\r\n\r\n \r\n\r\nMatrix Coding - 1\r\n\r\n \r\n\r\n Matlab documentation notes that efficient computation depends on using the matrix facilities, and that mathematically identical algorithms can have very different runtimes, but they are a bit coy about just what these differences are.  A simple but telling example:\r\n\r\n \r\n\r\n The following is the core of the GD-CLS algorithm of Berry et.al., copied from fig. 1 of Shahnaz et.al, 2006, \"Document clustering using nonnegative matrix factorization':\r\n\r\n         for jj = 1:maxiter\r\n             A = W'*W + lambda*eye(k);\r\n             for ii = 1:n\r\n                 b = W'*V(:,ii);\r\n                 H(:,ii) = A \\ b;\r\n             end\r\n             H = H .* (H>0);\r\n             W = W .* (V*H') ./ (W*(H*H') + 1e-9);\r\n         end\r\n\r\n \r\n\r\nReplacing the columwise update of H with a matrix update gives:\r\n\r\n          for jj = 1:maxiter\r\n              A = W'*W + lambda*eye(k);\r\n              B = W'*V;\r\n              H = A \\ B;\r\n              H = H .* (H>0);\r\n              W = W .* (V*H') ./ (W*(H*H') + 1e-9);\r\n          end\r\n\r\n \r\n\r\nThese were tested on an 8049 x 8660 sparse matrix bag of words V (.0083 non-zeros), with W of size 8049 x 50, H 50 x 8660, maxiter = 50, lambda = 0.1, and identical initial W.   They were run consecutivly, multithreaded on an 8-processor Sun server, starting at ~7:30PM.  Tic-toc timing was recorded. \r\n\r\n \r\n\r\nRuntimes were respectivly  6586.2 and 70.5 seconds, a 93:1 difference.  The maximum absolute pairwise difference between W matrix values was 6.6e-14.  \r\n\r\n \r\n\r\nSimilar speedups have been consistantly observed in other cases.  In one algorithm, combining matrix operations with efficient use of the sparse matrix facilities gave a 3600:1 speedup. \r\n\r\n \r\n\r\nFor speed alone, C-style iterative programming should be avoided wherever possible.  In addition,  when a couple lines of matrix code can substitute for an entire C-style function, program clarity is much improved. \r\n\r\n----------------------------------------------------------------------------------------------------------------------\r\n\r\n Matrix Coding - 2\r\n\r\n \r\n\r\nApplied to integration, the speed gains are not so great,  largely due to the time taken to set up the and deal with the boundaries.  The anyomous function setup time is neglegable.  I demonstrate on a simple uniform step linearly interpolated 1-D integration of cos() from 0 to pi, which should yield zero:\r\n          tic;\r\n          step = .00001;\r\n          fun = @cos;\r\n          start = 0;\r\n          endit = pi;\r\n          enda = floor((endit - start)/step)*step + start;\r\n          delta = (endit - enda)/step;\r\n          intF = fun(start)/2;\r\n          intF = intF + fun(endit)*delta/2;\r\n          intF = intF + fun(enda)*(delta+1)/2;\r\n          for ii = start+step:step:enda-step\r\n             intF = intF + fun(ii);\r\n          end\r\n          intF = intF*step\r\n          toc;\r\n\r\nintF =   -2.910164109692914e-14\r\nElapsed time is 4.091038 seconds.\r\n\r\n \r\n\r\nReplacing the inner summation loop with the matrix equivalent speeds things up a bit:\r\n\r\n          tic;\r\n          step = .00001;\r\n          fun = @cos;\r\n          start = 0;\r\n          endit = pi;\r\n          enda = floor((endit - start)/step)*step + start;\r\n          delta = (endit - enda)/step;\r\n          intF = fun(start)/2;\r\n          intF = intF + fun(endit)*delta/2;\r\n          intF = intF + fun(enda)*(delta+1)/2;\r\n          intF = intF + sum(fun(start+step:step:enda-step));\r\n          intF = intF*step\r\n          toc;\r\nintF =   -2.868419946011613e-14\r\nElapsed time is 0.141564 seconds.\r\n\r\n \r\n\r\nThe core computation take", "identifier": "DASHLINK_9", "issued": "2010-09-09", "keyword": ["ames", "dashlink", "nasa"], "landingPage": "https://c3.nasa.gov/dashlink/resources/9/", "modified": "2025-07-17", "programCode": ["026:029"], "publisher": {"@type": "org:Organization", "name": "Dashlink"}, "title": "Efficient Matlab Programs"}