Replacing substrings in C++, this time using maps, for multiple replacements

October 18, 2013

A few days ago we talked about how to replace substrings inside a string in C++. We ended up with a function we could copy and paste into our projects, but when we need to replace several different substrings the code gets ugly, and sometimes that approach doesn't fit at all.

We will use a common C++ container called map. It is a collection of associations between two values; we can think of it as an array of key-value elements. We will associate each substring to look for with its replacement (fromStrs with toStrs). We will also write a replace() function that takes two required arguments (the big string and the map), looks for each of the keys, and replaces them like this:

#include <iostream>
#include <string>
#include <map>

using namespace std;

// Replaces every occurrence of each key of strMap with its mapped value,
// one key at a time. The search starts at 'offset'; 'times' limits the total
// number of replacements (0 means no limit).
string replace(string source, const std::map<string,string>& strMap, int offset=0, int times=0)
{
  int total = 0;

  for (std::map<string, string>::const_iterator i=strMap.begin(); i!=strMap.end(); ++i)
    {
      const string& fromStr = i->first;
      const string& toStr = i->second;
      if (fromStr.empty())
        continue;  // An empty key would match everywhere and loop forever

      string::size_type pos = offset;
      while ( (pos = source.find(fromStr, pos)) != string::npos)
        {
          if ( (times!=0) && (total++>=times) )
            return source;  // Replacement limit reached

          source.replace(pos, fromStr.length(), toStr);
          pos+=toStr.size();  // Skip the text we have just inserted
        }
    }
  return source;
}

int main()
{
  string original = "I usually write silly things when testing my programs.";

  map<string,string> mapa;
  mapa["usually"] = "always";
  mapa["silly things"] = "lorem ipsum";

  cout << "Original string: "<<original<<endl;

  cout << "Resulting string: "<<replace(original, mapa)<<endl;

  return 0;
}

In this case we can add as many elements as we want to the map, and all of them will be searched for in the big string. This function is handy when we don't know the fromStr and toStr values at compile time (we can generate them at runtime), or when we want to fill the map little by little and then do all the replacements at once.
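One detail worth keeping in mind: std::map keeps its keys sorted, so a loop over the map visits the keys in lexicographic order, not in the order we inserted them. Here is a minimal sketch of that; keysInVisitOrder() and exampleMap() are hypothetical helper names used only for this illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original listing): returns the keys
// of a map in the order an iterator visits them.
std::vector<std::string> keysInVisitOrder(const std::map<std::string, std::string>& m)
{
  std::vector<std::string> keys;
  for (std::map<std::string, std::string>::const_iterator i = m.begin(); i != m.end(); ++i)
    keys.push_back(i->first);
  return keys;
}

// Builds the same map as the example above, inserting "usually" first.
std::map<std::string, std::string> exampleMap()
{
  std::map<std::string, std::string> m;
  m["usually"] = "always";
  m["silly things"] = "lorem ipsum";
  return m;
}
```

Calling keysInVisitOrder(exampleMap()) gives "silly things" first and "usually" second, so that is also the order in which a key-by-key replacement would process them.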

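But there is a small problem: when some toStr contains a fromStr, or vice versa, this function won't work as expected, because text inserted while processing one key can be matched and replaced again while processing a later key. Instead of doing all the replacements for one key and then moving on to the next, we have to walk the string once and, at each step, check every key in the map, replace only the nearest match and continue after it (single replacements, like we did with the old replace):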

#include <iostream>
#include <string>
#include <map>

using namespace std;

// Walks the string once. At each step it looks for the key whose next
// occurrence is closest to the current position, replaces only that
// occurrence and continues right after the inserted text.
// 'times' limits the number of replacements (0 means no limit).
string replace2(string source, const std::map<string,string>& strMap, int offset=0, int times=0)
{
  int total = 0;
  string::size_type pos=offset;

  do
    {
      // npos is the largest size_type value, so any real match is lower
      string::size_type lowerPos = string::npos;
      std::map<string, string>::const_iterator best = strMap.end();

      for (std::map<string, string>::const_iterator i=strMap.begin(); i!=strMap.end(); ++i)
        {
          if (i->first.empty())
            continue;  // An empty key would match everywhere and loop forever

          string::size_type newPos = source.find(i->first, pos);
          if (newPos<lowerPos)
            {
              best = i;
              lowerPos = newPos;
            }
        }

      if (best == strMap.end())
        break;  // No key appears after pos

      source.replace(lowerPos, best->first.length(), best->second);
      pos = lowerPos + best->second.size();  // Skip the text we have just inserted

    } while ( (times==0) || (++total<times) );

  return source;
}

int main()
{
  string original = "If a black bug bleeds black blood, what color blood does a blue bug bleed?";
  map<string,string> mapa;
  mapa["black"] = "blue";
  mapa["blue"] = "black";

  cout << "Original string: "<<original<<endl;

  cout << "Resulting string: "<<replace2(original, mapa)<<endl;

  return 0;
}

The expected result is:

Original string: If a black bug bleeds black blood, what color blood does a blue bug bleed?
Resulting string: If a blue bug bleeds blue blood, what color blood does a black bug bleed?

A bit of a tongue twister, but I think you get the idea.
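To see why the key-by-key approach can't do this swap, here is a small sketch. replaceKeyByKey() is a simplified stand-in for the first function (no offset or times arguments), and both function names are hypothetical, chosen only for this example.

```cpp
#include <map>
#include <string>

// Simplified key-by-key replacement: all occurrences of the first key are
// replaced, then all occurrences of the second key, and so on.
std::string replaceKeyByKey(std::string source, const std::map<std::string, std::string>& strMap)
{
  for (std::map<std::string, std::string>::const_iterator i = strMap.begin(); i != strMap.end(); ++i)
  {
    if (i->first.empty())
      continue;  // an empty key would never stop matching
    std::string::size_type pos = 0;
    while ((pos = source.find(i->first, pos)) != std::string::npos)
    {
      source.replace(pos, i->first.length(), i->second);
      pos += i->second.size();
    }
  }
  return source;
}

// Tries to swap "black" and "blue" with the key-by-key function.
std::string swapColorsKeyByKey()
{
  std::map<std::string, std::string> m;
  m["black"] = "blue";
  m["blue"] = "black";
  return replaceKeyByKey("a black bug and a blue bug", m);
}
```

The first pass turns every "black" into "blue", and the second pass turns every "blue" (including the ones just inserted) into "black", so swapColorsKeyByKey() returns "a black bug and a black bug" instead of swapping the colors.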

One more interesting thing is the map creation. Often we have a clear idea of which elements go in the map, and we don't want to spend time adding them one by one. As I told you some days ago, you can pass a variable number of arguments to a C function, so why not do it here? Let's pass the strings as const char* and add them to the map:

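#include <cstdarg>
#include <cstddef>
#include <map>
#include <string>

using namespace std;

// Builds a map from a NULL-terminated list of C strings:
// key, value, key, value, ..., NULL
map <string, string> strMap(const char* first, const char* second, ...)
{
  va_list args;
  map<string, string> ret;
  const char *key = NULL;  // Key waiting for its value, if any

  ret.insert(pair<string, string>(first, second));
  va_start(args, second);

  while (true)
    {
      const char *value = va_arg(args, const char*);
      if (value==NULL)
        break;  // Sentinel found: stop reading arguments

      if (key==NULL)
        key = value;  // First of a pair: remember the key
      else
        {
          ret.insert(pair<string, string>(key, value));  // Second of a pair: store it
          key = NULL;
        }
    }

  va_end(args);  // Every va_start needs its va_end
  return ret;
}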

Now we can create the map by doing:

map <string, string> mapa = strMap("black", "blue", "blue", "black", (const char*)NULL);

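Don’t forget the final NULL: strMap stops reading arguments only when it finds one. If you leave it out, the function keeps reading past the arguments you actually passed, which is undefined behavior. Sometimes a null value happens to be there and everything seems to work, but often you will get garbage in the map or a crash at runtime. Also, on platforms where NULL is defined as a plain integer 0, it is not guaranteed to be read back as a null pointer from a variable argument list, so casting it, as in (const char*)NULL, is the safer spelling.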
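If you can use a C++11 (or later) compiler, there is a different technique that avoids the sentinel altogether: brace initialization of the map. This is not the variable-argument approach shown above, just an alternative sketch; colorSwapMap() is a hypothetical name used for the example.

```cpp
#include <map>
#include <string>

// Assumes C++11 or later: std::map can be built from a braced list of
// {key, value} pairs, so no terminating NULL is needed.
std::map<std::string, std::string> colorSwapMap()
{
  return {
    {"black", "blue"},
    {"blue", "black"}
  };
}
```

The resulting map can be passed to replace2() exactly like the one built with strMap(), and a missing or misplaced element becomes a compile error instead of a runtime surprise.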

I hope this code is useful for you.
