adb: (Default)
[personal profile] adb
Pursuant to the previous entry (deliberately omitting some keywords), today I wrote a thing to find the nth permutation of a list in lexicographic order. I started by thinking about how to enumerate all the permutations. In Knuth's fascicle on the topic (7.2.1.2), there's a multiparagraph description of a procedural algorithm, and it feels really clunky, as procedural algorithms often do. Trying to squeeze this into Haskell (or indeed write it at all) hurt my brain until I took a moment to experiment with a naive recursive approach, which was correct immediately and didn't hurt at all.

First, some utility functions. Given a list, the "breaks" function does a break at each possible place, and the "plucks" function uses that to shove each possible element to the front of the list in order.
import List
breaks list = zip (inits list) (tails list)
plucks      = map (\(a,(b:bs)) -> (b:a++bs)) . filter (not.null.snd) . breaks

Once we have plucks, the recursive permutation generator is (almost) a one-liner:
lexPerm [] = [[]]
lexPerm list = concatMap (\(a:as) -> map (a:) $ lexPerm as) $ plucks list

nthLexPermNaive list n = (lexPerm list) !! n

(With "nth" counting from 0, of course.) For purposes of the problem at hand, this is plenty fast enough: I'm supposed to solve the problem in a minute, and I can get the millionth permutation in 10 seconds on my Mac. But 10 seconds is on the order of 10 billion operations with today's computers, which makes that pretty cheesy in the absolute sense, so I took a walk to grab some coffee and thought about it.

Of course we don't have to create the Library of Babel; we just want to index it. (Literally) exponentially better is, rather than generating all the permutations before the one we want, counting our way down to it:

facs              = scanl (*) 1 [1..]
nthLexPerm []   _ = []
nthLexPerm list n = (a : nthLexPerm as r) where                       
                       (q, r) = divMod n $ facs !! (length list - 1)
                       (a:as) = (plucks list) !! (fromIntegral q)

That is, we divide the permutations into buckets according to which list element goes first, count the size of the buckets with the factorial function, and figure out which bucket n is in; that tells us which list element is first (or next), and then we do it again with the rest of the list.

Here any reasonable-sized problem is done in a quarter of a second, and we can find the 177677923233571573268333138058341516699591046965363937405307966296381962023014186937941054363404599361752734941744506081291078533194009574416912957771832319737113761163613658869166119647099470158639932735389449631379529700548279204436061317363570804677816166240715615358077568957603030467568787485467283541408185096247050754846797356538240335105350184768319014921445795992771675038158542539720338558407349666880205297854949817875248581367158284791557981210613523726279305268854581861693708508755338380215840189011796350539753130229912815964814548277581186845850656885978762975155855474107799572375563645192398932479958005454837178864751059488666828167660475172792057451944248723909023070740690201704204218461563773065204842995022997001914717930110759352130009948356051182939291924622257070571219273403551153577186136834613745179747629411011898365507016397797867759946664396164352924586137424834445681000666697903306350922440956880951100822757699699465424770128867403734302955530773869642512499266246598049631536548394342928273951383947780838481331940343388486681310072330477634611447387268607357721000547836094024490472912547632604894400054262601148815209200611536824928251588246145576128796409004951032069138556006692728392557226610361497954045076321750967816141436901590269316896012779029961283289522264935778047226677551095885867628413873067487023134861122217469111708826648951837236648092031839291238953138538428903941298514916791414225901051629295349745208007851138576134281915440799039033355461955104921381340991048138391507141248899746190517335658061334946717710757497378214993292066589850913539310554144852367633304374349264637998925959048659450078909286667076496842368500758390291714346128962398932314723955587284880575413298280401511548961985187241644378252446046724890997960574325176236030134670928124244580787746244551203733440574462244367494084251204389007927127057222431093486818363231986782289599437634439821245791894971877394793437679047729838280097213664847275870378237936154598464096330797537260335212283067761495779501347588011587551451597326808458070615049984420408904723887522837614716173215691241120478506521674333009339601408207508521776650162676676295023496453038323193688322386683906501607884938848504056056122184887760796381727832479806801020006496228685146859635961757701415267608578011945810350755930572903179388703392038449379171278993700177260967524515557464916828891960707465435279682886539869740427639901298728803067080498751169532520002503500019740660107971999591938918905268705183191538820663699852370418356000032058644870979804667884114398299139416539829745624076354333334983912175144336698099557128491093291895474994348379714069190205016580963516757944876603657884155429310020024261160341686848907305323287103034949923069836744310897816391167069360635979144938330248793458305692974233820801160887623917525327260693056722736806805660369772849672778959546282332406742839875088406th permutation of the first paragraph of Moby-Dick in two seconds.

(If we wanted to permute the whole text of Moby-Dick, this might run into trouble: in the process of figuring out the first letter, we have to spend at least quadratic time computing the factorial of the length of the remaining part of the list. For the earlier permutations, we could trim this down a bit by doing the factorial mod n.)
From:
Anonymous( )Anonymous This account has disabled anonymous posting.
OpenID( )OpenID You can comment on this post while signed in with an account from many other sites, once you have confirmed your email address. Sign in using OpenID.
User
Account name:
Password:
If you don't have an account you can create one now.
Subject:
HTML doesn't work in the subject.

Message:

 
Notice: This account is set to log the IP addresses of people who comment anonymously.
Links will be displayed as unclickable URLs to help prevent spam.

Profile

adb: (Default)
Aaron D. Ball

December 2010

S M T W T F S
   1234
567891011
12131415161718
192021 22232425
262728293031 

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 25th, 2017 04:31 am
Powered by Dreamwidth Studios