
[ Grouping words and more ]

I'm working on a project to learn Clojure in practice. I'm doing well, but sometimes I get stuck. This time I need to transform a sequence of the form:

[":keyword0" "word0" "word1" ":keyword1" "word2" "word3"]

into:

[[:keyword0 "word0" "word1"] [:keyword1 "word2" "word3"]]

I've been trying for at least two hours, but I don't know enough Clojure functions to compose something useful that solves the problem in a functional manner.

I think this transformation should involve some kind of partitioning. Here is my attempt:

(partition-by (fn [x] (.startsWith x ":")) *1)

But the result looks like this:

((":keyword0") ("word0" "word1") (":keyword1") ("word2" "word3"))

Now I'd have to group it again... I doubt I'm doing the right thing here... Also, I need to convert the strings that begin with : (and only those) into keywords. I think this combination should work:

(keyword (subs ":keyword0" 1))

How do I write a function that performs this transformation in the most idiomatic way?

Answer 1

What about this:

(defn group-that [ arg ]
  (if (not-empty arg)
    (loop [list arg, acc [], result []]
      (if (not-empty list)
        (if (.startsWith (first list) ":")
          (if (not-empty acc)
            (recur (rest list) (vector (first list)) (conj result acc))
            (recur (rest list) (vector (first list)) result))
          (recur (rest list) (conj acc (first list)) result))
        (conj result acc)))))

It makes just one pass over the seq and doesn't need any macros.

Answer 2

Here is a version using reduce, which builds the result in a single pass over the data:

(reduce (fn [acc next]
          (if (.startsWith next ":")
            (conj acc [(-> next (subs 1) keyword)])
            (conj (pop acc) (conj (peek acc) next))))
        [] data)
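For reference, here is the same reduce idea wrapped in a function, as a minimal sketch. The name group-words is made up for illustration, and clojure.string/starts-with? (Clojure 1.8+) is swapped in for the .startsWith interop call. It assumes the first element is a ":"-prefixed string; with leading plain words, pop would be called on an empty vector and throw.

```clojure
(require '[clojure.string :as str])

;; group-words is a hypothetical name, not part of the original answer.
(defn group-words
  "Groups a flat seq of strings into vectors, each headed by a keyword
  built from a \":\"-prefixed string. Assumes the first element (if any)
  starts with \":\"."
  [data]
  (reduce (fn [acc s]
            (if (str/starts-with? s ":")
              ;; a keyword string opens a new group
              (conj acc [(keyword (subs s 1))])
              ;; a plain word is appended to the most recent group
              (conj (pop acc) (conj (peek acc) s))))
          []
          data))

(group-words [":keyword0" "word0" "word1" ":keyword1" "word2" "word3"])
;; => [[:keyword0 "word0" "word1"] [:keyword1 "word2" "word3"]]
```

Because the accumulator is a vector of vectors, peek and pop work on the last group, so each word is added without re-walking the result.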

Alternatively, you could extend your code like this

(->> data
     (partition-by #(.startsWith % ":"))
     (partition 2)
     (map (fn [[[kw-str] strs]]
            (cons (-> kw-str
                      (subs 1)
                      keyword)
                  strs))))

Answer 3

Since the question is already here... This is my best effort:

(def data [":keyword0" "word0" "word1" ":keyword1" "word2" "word3"])

(->> data
     (partition-by (fn [x] (.startsWith x ":")))
     (partition 2)
     (map (fn [[[k] w]] (apply conj [(keyword (subs k 1))] w))))

I'm still looking for a better solution or criticism of this one.
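As one piece of criticism, the pipeline relies on keyword groups and word groups strictly alternating. The sketch below wraps it in a function (group-pairs is a hypothetical name used only here) and shows what seems to happen when that assumption fails: partition-by merges adjacent keyword strings into one group, and (partition 2) drops an incomplete trailing group.

```clojure
;; group-pairs is a hypothetical wrapper around the pipeline above.
(defn group-pairs [data]
  (->> data
       (partition-by (fn [x] (.startsWith x ":")))
       (partition 2)
       (map (fn [[[k] w]] (apply conj [(keyword (subs k 1))] w)))))

;; Well-formed input: keywords and words alternate.
(group-pairs [":a" "x" ":b" "y"])
;; => ([:a "x"] [:b "y"])

;; Two keyword strings in a row land in the same partition;
;; the destructuring [[k] w] only looks at the first one.
(group-pairs [":a" ":b" "x"])
;; => ([:a "x"])

;; A trailing keyword with no words forms an incomplete pair,
;; which (partition 2) discards.
(group-pairs [":a" "x" ":b"])
;; => ([:a "x"])
```

If the real data can contain keywords without words, one of the other approaches may be a safer fit.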

Answer 4

First, let's construct a function that breaks a vector v into sub-vectors, with a break at every element for which the predicate pred holds.

(defn breakv-by [pred v]
  (let [break-points (filter identity (map-indexed (fn [n x] (when (pred x) n)) v))
        starts (cons 0 break-points)
        finishes (concat break-points [(count v)])]
    (mapv (partial subvec v) starts finishes)))

For our case, given

(def data [":keyword0" "word0" "word1" ":keyword1" "word2" "word3"])

the call

(breakv-by #(= (first %) \:) data)

returns

[[] [":keyword0" "word0" "word1"] [":keyword1" "word2" "word3"]]

Notice that the initial sub-vector is different:

  • It has no element for which the predicate holds.
  • It can be of length zero.

All the others

  • start with their only element for which the predicate holds and
  • are at least of length 1.

So breakv-by behaves properly with data that

  • doesn't start with a breaking element or
  • has a succession of breaking elements.
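The two cases just listed can be checked at the REPL. This is a small sketch that repeats the breakv-by definition so it runs on its own; colon? is a hypothetical name for the predicate used in this answer.

```clojure
(defn breakv-by [pred v]
  (let [break-points (filter identity (map-indexed (fn [n x] (when (pred x) n)) v))
        starts (cons 0 break-points)
        finishes (concat break-points [(count v)])]
    (mapv (partial subvec v) starts finishes)))

;; colon? is a hypothetical helper name for the predicate.
(def colon? #(= (first %) \:))

;; Data that doesn't start with a breaking element:
;; the initial sub-vector holds the leading words.
(breakv-by colon? ["w" ":a" "x"])
;; => [["w"] [":a" "x"]]

;; A succession of breaking elements:
;; each one starts its own sub-vector.
(breakv-by colon? [":a" ":b" "x"])
;; => [[] [":a"] [":b" "x"]]
```

Note that v has to be a vector, since subvec doesn't accept other sequence types.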

For the purposes of the question, we need to muck about with what breakv-by produces somewhat:

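(let [pieces (breakv-by #(= (first %) \:) data)]
  (mapv
    ;; turn the leading ":"-prefixed string of each piece into a keyword
    #(update-in % [0] (fn [s] (keyword (subs s 1))))
    ;; skip the initial sub-vector, which holds no keyword
    (rest pieces)))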
;[[:keyword0 "word0" "word1"] [:keyword1 "word2" "word3"]]