[ Searching Specific Data From a File ]

I have a file containing text and a few numbers. I just want to extract the numbers from it. How do I go about it?

I tried using Split, but no luck so far. My file looks like this:

*AT+CMGL="ALL" +CMGL: 5566,"REC READ","Ufone" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,"REC READ","+923466666666"*

Kindly tell me how to extract numbers like +923466666666 from this file so I can put them into another file or a textbox.

Thanks

Answer 1


If the numbers are all at the end of their lines, you can use code like the following:

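// Requires: using System.IO; using System.Text.RegularExpressions;
foreach ( string line in File.ReadAllLines(@"c:\path\to\file.txt") ) {
  // Matches a '+' followed by digits and a closing quote at the very end of the line.
  Match result = Regex.Match(line, @"\+(\d+)""$");
  if ( result.Success ) {
    // Groups[1] holds the digits only; move \+ inside the parentheses to keep the '+'.
    var number = result.Groups[1].Value;
    // do what you want with the number
  }
}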
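
If the numbers are not always at the end of a line (the sample in the question looks like it may hold several +CMGL entries in one run of text), a variation is to scan the whole text for quoted values that start with '+', and then write them out to another file as the question asks. This is only a sketch: the class name QuotedNumberScanner, its method names, and the path parameters are made up for illustration, and the pattern assumes every wanted number is a '+' followed by digits inside double quotes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer above.
public static class QuotedNumberScanner
{
    // Matches "+digits" inside double quotes; group 1 keeps the '+' with the digits.
    private static readonly Regex QuotedNumber = new Regex("\"(\\+\\d+)\"");

    // Returns every quoted +number found anywhere in the text, in order of appearance.
    public static List<string> FindNumbers(string text)
    {
        List<string> numbers = new List<string>();
        foreach (Match match in QuotedNumber.Matches(text))
        {
            numbers.Add(match.Groups[1].Value);
        }
        return numbers;
    }

    // Reads the input file and writes one number per line to the output file.
    public static void CopyNumbers(string inputPath, string outputPath)
    {
        List<string> numbers = FindNumbers(File.ReadAllText(inputPath));
        File.WriteAllLines(outputPath, numbers.ToArray());
    }
}
```

For a textbox instead of a file, the list returned by FindNumbers could be joined with string.Join(Environment.NewLine, numbers.ToArray()) and assigned to the Text property.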

Answer 2


Here's an example using String.Split. The "number" contains a '+', so it should really be treated as a string rather than a number. I'm presuming it's a telephone number, with the '+' used as the prefix for international calls? If it is a telephone number, you need to watch out for dashes and spaces in the number, as well as extension numbers added to the end, e.g. "+9234 666-66666 ext 235", and so on.

Anyway - hopefully the example is useful in getting to grips with Split.

The code includes unit tests using NUnit v2.4.8.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using NUnit.Framework;
using System.Text.RegularExpressions;

namespace SO.NumberExtractor.Test
{
    public class NumberExtracter
    {
        public List<string> ExtractNumbers(string lines)
        {
            List<string> numbers = new List<string>();
            // Split on the platform's line break and skip blank lines.
            string[] separator = { System.Environment.NewLine };
            string[] separatedLines = lines.Split(separator, StringSplitOptions.RemoveEmptyEntries);

            foreach (string line in separatedLines)
            {
                string s = ExtractNumber(line);
                numbers.Add(s);
            }

            return numbers;
        }

        public string ExtractNumber(string line)
        {
            string s = line.Split(',').Last<string>().Trim('"');
            return s;
        }

        public string ExtractNumberWithoutLinq(string line)
        {
            string[] fields = line.Split(',');
            string s = fields[fields.Length - 1];
            s = s.Trim('"');

            return s;
        }
    }

    [TestFixture]
    public class NumberExtracterTest
    {
        private readonly string LINE1 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666666\"";
        private readonly string LINE2 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666667\"";
        private readonly string LINE3 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666668\"";

        [Test]
        public void ExtractOneLineWithoutLinq()
        {            
            string expected = "+923466666666";

            NumberExtracter c = new NumberExtracter();
            string result = c.ExtractNumberWithoutLinq(LINE1);

            Assert.AreEqual(expected, result);            
        }

        [Test]
        public void ExtractOneLineUsingLinq()
        {
            string expected = "+923466666666";

            NumberExtracter c = new NumberExtracter();
            string result = c.ExtractNumber(LINE1);

            Assert.AreEqual(expected, result);
        }

        [Test]
        public void ExtractMultipleLines()
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendLine(LINE1);
            sb.AppendLine(LINE2);
            sb.AppendLine(LINE3);

            NumberExtracter ne = new NumberExtracter();
            List<string> extractedNumbers = ne.ExtractNumbers(sb.ToString());

            string expectedFirst = "+923466666666";
            string expectedSecond = "+923466666667";
            string expectedThird = "+923466666668";

            Assert.AreEqual(expectedFirst, extractedNumbers[0]);
            Assert.AreEqual(expectedSecond, extractedNumbers[1]);
            Assert.AreEqual(expectedThird, extractedNumbers[2]);
        }
    } 
}
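
The caution above about dashes, spaces and extension numbers could be handled with a small normalization step before the numbers are stored or compared. What follows is a rough sketch under my own assumptions, not part of the tested code: the class PhoneNumberNormalizer and its Normalize method are hypothetical names, the extension is assumed to be introduced by the text "ext", and everything except a leading '+' and the ASCII digits is simply dropped.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the answer's tested code.
public static class PhoneNumberNormalizer
{
    // Drops anything from "ext" onward, then keeps a leading '+' and the digits.
    public static string Normalize(string raw)
    {
        string main = raw;

        // Assumption: an extension, if present, starts with "ext" in any casing.
        int extIndex = raw.IndexOf("ext", StringComparison.OrdinalIgnoreCase);
        if (extIndex >= 0)
        {
            main = raw.Substring(0, extIndex);
        }

        StringBuilder sb = new StringBuilder();
        foreach (char c in main)
        {
            bool isAsciiDigit = c >= '0' && c <= '9';
            bool isLeadingPlus = c == '+' && sb.Length == 0;
            if (isAsciiDigit || isLeadingPlus)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, Normalize("+9234 666-66666 ext 235") gives "+923466666666". Real-world phone formats vary a lot more than this, so treat it as a starting point only.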

Answer 3


How large is the file? If it is under a few megabytes, I would recommend loading its contents into a string and using a compiled regular expression to extract the matches.

Here's a quick example:

    Regex NumberExtractor = new Regex("[0-9]{7,16}",RegexOptions.Compiled);

    /// <summary>
    /// Extracts runs of seven to sixteen digits from the target file.
    /// Example: the text +923466666666 yields 923466666666 (the leading '+' is not captured).
    /// </summary>
    /// <param name="TargetFilePath">Path of the file to scan.</param>
    /// <returns>List of the matching numbers</returns>
    private IEnumerable<ulong> ExtractLongNumbersFromFile(string TargetFilePath)
    {

        if (String.IsNullOrEmpty(TargetFilePath))
            throw new ArgumentException("TargetFilePath is null or empty.", "TargetFilePath");

        if (File.Exists(TargetFilePath) == false)
            throw new FileNotFoundException("Target file does not exist!", TargetFilePath);

        List<ulong> ReturnList = new List<ulong>();

        try
        {
            // The using blocks dispose the reader and the stream, even if an exception is thrown.
            using (FileStream TargetFileStream = new FileStream(TargetFilePath, FileMode.Open, FileAccess.Read))
            using (StreamReader TargetFileStreamReader = new StreamReader(TargetFileStream))
            {
                string FileContents = TargetFileStreamReader.ReadToEnd();

                foreach (Match CurrentMatch in NumberExtractor.Matches(FileContents))
                {
                    // A match has at most 16 digits, so it always fits in a ulong.
                    ReturnList.Add(System.Convert.ToUInt64(CurrentMatch.Value));
                }
            }
        }
        catch (Exception ex)
        {
            //Your logging, etc...
        }

        return ReturnList;


    }

Sample Usage:

List<ulong> Numbers = new List<ulong>(ExtractLongNumbersFromFile(@"v:\TestExtract.txt"));
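
If the file turns out to be larger than a few megabytes, one option is to apply the same pattern one line at a time instead of loading everything into a single string. The sketch below assumes .NET 4.0 or later (for File.ReadLines, which reads lazily) and assumes no number is split across a line break; the class name StreamingNumberExtractor and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical line-by-line variant of the approach above.
public static class StreamingNumberExtractor
{
    // Same pattern as above: runs of seven to sixteen digits.
    private static readonly Regex DigitRun = new Regex("[0-9]{7,16}", RegexOptions.Compiled);

    // Lazily yields the numbers found in each line, in order.
    public static IEnumerable<ulong> ExtractFromLines(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            foreach (Match match in DigitRun.Matches(line))
            {
                yield return Convert.ToUInt64(match.Value);
            }
        }
    }

    // File.ReadLines streams the file, so only one line is held in memory at a time.
    public static IEnumerable<ulong> ExtractFromFile(string path)
    {
        return ExtractFromLines(File.ReadLines(path));
    }
}
```

Because the result is produced lazily, the file is only read while the caller enumerates it, for example with foreach or by passing it to new List<ulong>(...).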