I wanted to remove an element named ‘textForm’ from all xmls in a directory. There were around 3000 xml files. I was planning to do it using Groovy, but later, I decided to tryout unix shell scripts to perform this operation.

Here’s the script I used :

 

#!/bin/bash
 
FILES=$HOME/xmls/*
 
for f in $FILES
do
  echo "Processing $f"
  DATA=$(cat $f | sed '/textForm/d')
  rm -f $f
  echo $DATA > $f
done

This is a very simple tip on how to make your life easier using unix. But one issue I encountered was that whenever I gave same file as input and output to sed, data loss occurred. That’s the main reason for me to split the process into three steps: Give the input to sed &store its output into a variable, delete the original file, and use echo the contents of output variable into file with same name. If you know anyother methods to perform this easily, please share with me.

Last month, I was interviewed by InfinitelyBeta recruitment team. Interview was really cool, and I really enjoyed every part of it. One of the tasks given to me was to implement a simple function called reduce with the following characteristics:

  • first parameter of reduce should be name of a function, defined earlier.
  • second parameter should be a list of arguments, which is to be passed to the first parameter.
  • should return the result of executing first parameter

for eg:

 

1
2
3
4
5
def add(a, b):
    return a+b
'''
reduce function should called as : reduce(add, [1,2]) and should return 3
'''

Let’s see how to write a function, which will call another function using the parameters specified.

In python, we can write a function as follows:
Container Function

1
2
3
4
5
6
7
# container function will call the function func with the parameter args
def container(func, args):
    func(args)
 
# generic function to print
def printer(a):
    print a

If we call

1
container(printer, "hello")

it will print hello. If we try to pass the second parameter as a list, for eg:

1
container(printer, [1, 2, "hello"])

,it will just print the same list for you. (Remember that print function will just print the complete list when a list is passed to it as argument).

Now, let’s try something more deeper into this container function. Redefining the container function as follows :

1
2
def container(func, *args): #line 1
    return func(*args) #line 2
  • RULE 1 : asterisk before args in the first line means to “take the rest of the parameters given and put them in a list called args.”
  • RULE 2 : asterisk before args in the second line mean “take this list called args and unwrap it into the rest of the parameters.”

Now, let’s see what happens:

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def container(fun, *args):
...     return fun(*args)
... 
>>> def add(a, b):
...     return a+b
... 
>>> container(add, 1,2)
3

Here, we got 3 as output, which is correct. How?

When we called the container function with add, 1, 2 as arguments, 1, 2 where added to the tuple args (see RULE 1). Thus, in container function, func => add and args = [1, 2]. Then, according to RULE 2, the tuple args was expanded into parameters for add and add(1, 2) was called. We could easily find the values in args as:

>>> def container(func, *args):
...     print args
...
>>> container(add, 1,2)
(1, 2)
>>>

Now see what happens when we try to pass a list as the argument to second parameter:

>>> container(add, [1, 2])
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 2, in container
TypeError: add() takes exactly 2 arguments (1 given)
>>>

This time, the entire list [1, 2] was taken as a single argument. We may solve this issue by making a minor change to our container function as follows:

>>> def container(fun, args):
...     return fun(*args)
... 
>>> container(add, [1,2])
3
>>>

Here, we’ve specified that container will take exactly 2 arguments, and the second argument args should be unwraped into a list of parameters.

So, what if we need to call the container function as follows?

container(function-name, [1,2], [4, 5])

Here we have 3 parameters, in which second and third are lists. We shall define the container function as:

>>> def container(fun, *args):
...     return fun(*args)
>>>
>>>
>>> container(add, [1,2], [4,5])
[1, 2, 4, 5]

Now, we have enough background to implement the reduce function. I was asked to do it for the add function, so, I did it as follows:

def reduce(fun, args):
    k = args[0]
    if len(args) == 2:
        k = fun(arg[0], arg[1])
        return k  
    return k + reduce(fun, args[1:])

Since this was only for add, I was asked to make it generic for any function. After some corrections to it, I changed it as follows:

Reduce function

1
2
3
4
5
6
def reduce(fun, args):
    k = args[0]
    if len(args) == 2:
        k = fun(arg[0], arg[1])
        return k  
    return fun(k, reduce(fun, args[1:]))


Tail recursive implementation

1
2
3
4
5
def reduce(fun, args):
    if len(args)==1:
        return args[0]
    args.insert(args[2:], fun(args[0],args[1])
    return reduce(fun, args )

Bottom line:
Infinitely Beta recruited me, and received the offer letter 2 days after my interview :)