Digital file size computational procedure in multimedia big data using sampling methodology

Multimedia Tools and Applications

Abstract

Multimedia big data tends to grow quickly over time because of its basic characteristics: volume, variety and velocity. Sample-based estimates are used to compute unknown population parameters. Multimedia big data is characterized by various features that are prominent in terms of identification and analysis. Social media platforms are major sources of big data, generated through communication among registered users in the form of text, video and image data. The number of registered users is also growing rapidly. When a user registers on a portal, the system allots a default amount of digital storage space, and the space used increases over time. A monitoring system is required to anticipate this growth and to alert data center managers so that infrastructure can be expanded. In medical diagnostics, CT-scan and MRI equipment produce a huge amount of scan file data when used on a large number of patients. These files are stored in the system's memory for at least a prefixed duration. The digital file size of such reports, the pixel density and the intensity of the contents are the prime parameters of interest when comparing the quality of similar types of machines. Doctors and patients on social media platforms exchange digital medical reports such as X-ray, CT-scan, MRI and cancer diagnostic reports, which occupy the default storage. An estimate of digital file size can help determine the expected amount of digital storage to be allocated to medical professionals and other such users. This paper presents sample-based estimation methods for estimating the average file size over several time points. Confidence intervals are used as a tool for comparison. A new simulation procedure is also suggested for comparing confidence intervals. At multiple optimum values of a constant, the proposed sample-based estimation methods perform better, and the proposed simulation method is also effective. The strategy of using supporting information from another multimedia variable in the estimation procedure is found to be useful and effective. The findings of the paper are numerically supported, and a significant percentage gain is observed for the proposed methods in the setup of multimedia big data circulating over social media platforms.
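
The general idea of estimating an average file size from a sample, with a confidence interval around it, can be illustrated with a minimal Python sketch. This is not the estimator proposed in the paper: it assumes plain simple random sampling without replacement and a normal-approximation interval. The function name `srswor_mean_ci`, the multiplier `z = 1.96`, the fixed seed and the choice of example values (column T of Occasion I for IDs 1 to 10 in Appendix A) are assumptions made only for this illustration.

```python
import math
import random
import statistics


def srswor_mean_ci(population, n, z=1.96, seed=0):
    """Estimate a population mean from a simple random sample drawn
    without replacement, with an approximate confidence interval.

    Illustrative helper (hypothetical name), not the paper's estimator.
    """
    N = len(population)
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    sample = rng.sample(population, n)   # sampling without replacement
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample) if n > 1 else 0.0
    # Standard error with the finite population correction (1 - n/N).
    se = math.sqrt((1 - n / N) * s2 / n)
    return mean, (mean - z * se, mean + z * se)


# Column T, Occasion I, IDs 1-10 of Appendix A, treated here as a toy population.
sizes = [5, 9, 8, 12, 11, 13, 14, 15, 17, 18]

mean, (low, high) = srswor_mean_ci(sizes, n=5)
print(f"estimated mean: {mean:.2f}, approximate 95% CI: ({low:.2f}, {high:.2f})")
```

When the sample size equals the population size, the finite population correction drives the standard error to zero and the interval collapses onto the population mean, which is a convenient sanity check for the helper.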

Data availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgements

The authors are thankful to all the reviewers of this paper for their useful comments, which have considerably improved the quality of the manuscript.

Author information

Corresponding authors

Correspondence to Abdul Alim or Diwakar Shukla.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Population N = 100

 ID |   Occasion I |  Occasion II | Occasion III |  Occasion IV |   Occasion V |  Occasion VI
    |   T   V   I |   T   V   I |   T   V   I |   T   V   I |   T   V   I |   T   V   I
  1 |   5   8   7 |   9   3  11 |   8  11  17 |   9  12  13 |  15  16  18 |  19  23  24
  2 |   9  12  11 |   5   7  17 |  16  17  27 |  18  23  24 |  28  32  34 |  37  44  47
  3 |   8  11  14 |  16   6  23 |  14  15  36 |  15  19  21 |  24  27  29 |  32  38  40
  4 |  12  20  18 |  10  15  29 |  22  28  45 |  24  30  32 |  37  42  46 |  49  59  62
  5 |  11  16  22 |   9  11  35 |  19  22  54 |  21  27  29 |  33  37  41 |  44  52  56
  6 |  13  18  25 |  11  13  41 |  23  25  64 |  26  33  35 |  40  46  49 |  53  64  68
  7 |  14  21  29 |  17  16  47 |  25  29  73 |  27  34  37 |  42  48  52 |  56  67  71
  8 |  15  23  33 |  13  18  52 |  27  32  82 |  30  38  41 |  47  53  58 |  62  74  79
  9 |  17  26  37 |  15  21  58 |  30  36  91 |  33  42  45 |  52  58  63 |  68  82  87
 10 |  18  28  40 |  24  23  64 |  33  39 101 |  36  46  49 |  56  64  69 |  75  89  95
 11 |  20  31  44 |  18  26  70 |  35  43 110 |  39  49  53 |  61  69  75 |  81  97 102
 12 |  21  33  48 |  19  28  76 |  38  46 119 |  42  53  57 |  66  74  81 |  87 104 110
 13 |  23  36  51 |  21  31  82 |  41  50 128 |  45  57  61 |  70  79  86 |  93 111 118
 14 |  24  38  55 |  22  33  88 |  44  53 138 |  48  61  65 |  75  85  92 |  99 119 126
 15 |  26  41  59 |  24  36  94 |  46  57 147 |  51  64  69 |  80  90  98 | 105 126 134
 16 |  27  43  62 |  25  38 100 |  49  60 156 |  54  68  73 |  84  95 103 | 112 133 141
 17 |  29  46  66 |  27  41 106 |  52  64 165 |  57  72  77 |  89 100 109 | 118 141 149
 18 |  30  48  70 |  28  43 112 |  54  67 175 |  60  76  82 |  94 106 115 | 124 148 157
 19 |  32  51  74 |  30  46 118 |  57  71 184 |  63  79  86 |  98 111 120 | 130 155 165
 20 |  33  53  77 |  31  48 124 |  60  74 193 |  66  83  90 | 103 116 126 | 136 163 173
 21 |  35  56  81 |  33  51 129 |  62  78 202 |  69  87  94 | 108 121 132 | 142 170 180
 22 |  36  58  85 |  34  53 135 |  65  81 212 |  72  91  98 | 112 127 138 | 148 177 188
 23 |  38  61  88 |  36  56 141 |  68  85 221 |  75  94 102 | 117 132 143 | 155 185 196
 24 |  39  63  92 |  37  58 147 |  71  88 230 |  78  98 106 | 122 137 149 | 161 192 204
 25 |  41  66  96 |  39  61 153 |  73  92 239 |  81 102 110 | 126 142 155 | 167 199 212
 26 |  42  68  99 |  40  63 159 |  76  95 249 |  84 106 114 | 131 148 160 | 173 207 219
 27 |  44  71 103 |  42  66 165 |  79  99 258 |  87 109 118 | 135 153 166 | 179 214 227
 28 |  45  73 107 |  43  68 171 |  81 102 267 |  90 113 122 | 140 158 172 | 185 221 235
 29 |  47  76 111 |  45  71 177 |  84 106 276 |  93 117 126 | 145 163 177 | 191 229 243
 30 |  48  78 114 |  46  73 183 |  87 109 286 |  96 121 130 | 149 169 183 | 198 236 251
 31 |  50  81 118 |  48  76 189 |  89 113 295 |  99 124 134 | 154 174 189 | 104 244 258
 32 |  51  83 122 |  49  78 195 |  92 116 304 | 102 128 138 | 159 179 195 | 110 251 266
 33 |  53  86 125 |  51  81 200 |  95 120 313 | 105 132 142 | 163 184 200 | 116 258 274
 34 |  54  88 129 |  52  83 206 |  98 123 323 | 108 136 146 | 168 190 206 | 122 266 282
 35 |  56  91 133 |  54  86 212 | 100 127 332 | 111 139 150 | 173 195 212 | 128 273 290
 36 |  57  93 136 |  55  88 218 | 103 130 341 | 114 143 154 | 177 200 217 | 135 280 297
 37 |  59  96 140 |  57  91 224 | 106 134 350 | 117 147 158 | 182 205 223 | 141 288 305
 38 |  60  98 144 |  58  93 230 | 108 137 360 | 120 151 163 | 187 211 229 | 147 295 313
 39 |  62 101 148 |  60  96 236 | 111 141 369 | 123 154 167 | 191 216 234 | 153 302 321
 40 |  63 103 151 |  61  98 242 | 114 144 378 | 126 158 171 | 196 221 240 | 159 310 329
 41 |  65 106 155 |  63 101 248 | 116 148 387 | 129 162 175 | 101 226 246 | 165 317 336
 42 |  66 108 159 |  64 103 254 | 119 151 397 | 132 166 179 | 105 232 252 | 171 324 344
 43 |  68 111 162 |  66 106 260 | 122 155 406 | 135 169 183 | 110 237 257 | 178 332 352
 44 |  69 113 166 |  67 108 266 | 125 158 415 | 138 173 187 | 115 242 263 | 184 339 360
 45 |  71 116 170 |  69 111 272 | 127 162 424 | 141 177 191 | 119 247 269 | 190 346 368
 46 |  72 118 173 |  70 113 277 | 130 165 434 | 144 181 195 | 124 253 274 |  96 354 375
 47 |  74 121 177 |  72 116 283 | 133 169 443 | 147 184 199 | 128 258 280 | 102 361 383
 48 |  75 123 181 |  73 118 289 | 135 172 452 | 150 188 203 | 133 263 286 | 208 368 391
 49 |  77 126 185 |  75 121 295 | 138 176 461 | 153 192 207 | 138 268 291 | 114 376 399
 50 |  78 128 188 |  76 123 301 | 141 179 471 | 156 196 211 | 142 274 297 | 121 383 407
 51 |  80 131 192 |  78 126 307 | 143 183 480 | 159 199 215 | 147 279 303 | 127 391 414
 52 |  81 133 196 |  79 128 313 | 146 186 489 | 162 203 219 | 152 284 309 | 233 398 422
 53 |  83 136 199 |  81 131 319 | 149 190 498 | 165 207 223 | 156 289 314 | 139 405 430
 54 |  84 138 203 |  82 133 325 | 152 193 508 | 168 211 227 | 123 295 320 | 145 413 438
 55 |  86 141 207 |  84 136 331 | 154 197 517 | 171 214 231 | 120 300 326 | 251 420 446
 56 |  87 143 210 |  85 138 337 | 157 200 526 | 174 218 235 | 170 305 331 | 158 427 453
 57 |  89 146 214 |  87 141 343 | 160 204 535 | 177 222 239 | 175 310 337 | 264 435 461
 58 |  90 148 218 |  88 143 348 | 162 207 545 | 180 226 244 | 180 316 343 | 170 442 469
 59 |  92 151 222 |  90 146 354 | 165 211 554 | 183 229 248 |  84 321 348 | 245 449 477
 60 |  93 153 225 |  91 148 360 | 168 214 563 | 186 233 252 |  28 326 354 | 180 457 485
 61 |  95 156 229 |  93 151 366 | 170 218 572 | 189 237 256 |  94 331 360 | 288 464 492
 62 |  96 158 233 |  94 153 372 | 173 221 582 | 192 241 260 |  98 337 366 | 135 471 500
 63 |  98 161 236 |  96 156 378 | 176 225 591 | 195 244 264 | 102 342 371 | 202 479 508
 64 |  99 163 240 |  97 158 384 | 179 228 600 | 198 248 268 | 108 347 377 | 207 486 516
 65 | 101 166 244 |  99 161 390 | 181 232 609 | 101 252 272 | 113 352 383 | 196 493 524
 66 | 102 168 247 |  80 163 396 | 184 235 619 | 104 256 276 | 114 358 388 | 199 501 531
 67 | 104 171 251 |  35 166 402 | 187 239 628 | 107 259 280 | 213 363 394 | 258 508 539
 68 | 105 173 255 |  81 168 408 | 189 242 637 | 105 263 284 | 126 368 400 | 231 515 547
 69 | 107 176 259 | 102 171 414 | 192 246 646 | 113 267 288 | 131 373 405 | 237 523 555
 70 | 108 178 262 |  99 173 420 | 195 249 656 | 116 271 292 | 135 379 411 | 244 530 563
 71 | 110 181 266 |  88 176 425 | 197 253 665 | 109 274 296 | 140 384 417 | 250 538 570
 72 | 111 183 270 | 106 178 431 | 200 256 674 |  85 278 300 | 145 389 423 | 256 545 578
 73 | 113 186 273 |  59 181 437 | 203 260 683 |  95 282 304 | 149 394 428 | 262 552 586
 74 |  45 188 277 |  43 183 443 |  81 263 693 | 100 113 122 | 140 158 171 | 185 221 234
 75 | 116 191 281 | 114 186 449 | 208 267 702 | 132 289 212 | 159 405 440 | 274 567 602
 76 | 117 193 284 |  26 188 455 | 211 270 711 | 102 293 123 | 163 410 445 | 281 574 609
 77 | 119 196 288 | 117 191 461 | 214 274 720 | 132 297  13 | 168 415 451 | 284 582 617
 78 | 120  45 292 | 118  40 467 | 216  63 730 | 140 102  25 | 173 421 457 | 293 589 625
 79 | 122 201  70 | 120 196 112 | 219 281 175 | 125 100  30 | 177 129 462 | 299 596 633
 80 | 123 170  85 | 121 165 136 | 178 238 213 | 142 108  33 | 182 331 268 | 205 604 641
 81 | 125 106 103 | 123 101 165 | 159 148 258 | 127 113 137 | 187 336 274 | 211 611 648
 82 | 126 208 107 |  82 203 171 | 130 291 268 | 152 116 241 | 191 342 340 | 217 618 156
 83 | 128 211 210 |  43 206 336 |  85 295 525 | 155 119 145 | 196 347 385 | 224 626 164
 84 | 129  13 214 |  55   8 342 | 133  18 535 | 150 125 110 | 101 352 291 | 230 633 272
 85 | 131 216 215 | 129 102 344 | 135 302 538 | 142 140 270 | 105 357 497 | 236 640 180
 86 | 132 218 231 | 130 103 370 | 138 305 578 | 178 179 154 | 110 363 302 | 242 148 287
 87 | 134  25 225 | 132  20 200 | 142  35 563 | 167 180 102 | 114 368 508 | 248 155 295
 88 | 135 223 229 | 133 218 205 | 243 112 573 | 170 165  78 | 319 373 375 | 254 162 203
 89 | 137 226 233 | 135 221 100 | 246 116 583 | 173 141  69 | 324 378 519 | 260 170 311
 90 | 138 105 110 | 136 100 176 | 249 125 275 | 176  88  73 | 328 384 525 | 267 300 319
 91 | 122   5  15 | 120   0  24 | 220   7  38 | 144 107  29 | 378  27 464 | 200  98 234
 92 | 102 152 210 | 100 147 336 | 184 102 525 | 104 102  75 | 316  57 288 | 218 300 230
 93 | 113 136 148 | 111 131 237 | 203 130 370 | 126  99  10 | 350 396 229 | 300 290 188
 94 | 110  20  35 | 108  15  56 | 198  28  88 | 120 100  97 | 341 385 218 | 200 201 172
 95 | 106 120 152 | 104 115 243 | 191  68 380 | 112 142  86 | 329 371 203 | 205 219 201
 96 | 102 119 125 | 100 114 200 | 184  67 313 | 102 153 175 | 316 357 288 | 203 200 130
 97 | 105 100 101 | 103  95 162 | 189  40 253 | 103 152 184 | 326 368 299 | 231 215 220
 98 | 100  85  90 |  98  80 144 | 180  19 225 | 108  50  70 | 310 350 280 | 210 290 175
 99 | 120  75  60 | 118  70  96 | 216 100 150 | 130 100 100 | 372 320 356 | 292 288 124
100 | 125  70  25 | 123  65  40 | 225  98  63 |  85 113  12 | 388 338 375 | 213 213 150

The population dataset and the Python code used in this paper to calculate the results at each point in time are available at: https://abdulalim90.blogspot.com/
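
As a rough illustration of how supporting information from a second multimedia variable can enter an estimate, the sketch below applies the classical ratio estimator to a small subset of Appendix A and compares it with the plain sample mean over repeated samples. This is a textbook technique swapped in for illustration, not the estimator or simulation procedure proposed in the paper. Treating column V as the study variable and column T as the auxiliary variable, the function names `ratio_estimate` and `compare_mse`, the sample size, the number of repetitions and the seed are all assumptions.

```python
import random
import statistics

# Occasion I values for IDs 1-10, copied from Appendix A.
T = [5, 9, 8, 12, 11, 13, 14, 15, 17, 18]    # column T, used here as the auxiliary variable
V = [8, 12, 11, 20, 16, 18, 21, 23, 26, 28]  # column V, used here as the study variable


def ratio_estimate(y_sample, x_sample, x_pop_mean):
    """Classical ratio estimator of the mean of y, given the known
    population mean of the auxiliary variable x."""
    return statistics.fmean(y_sample) / statistics.fmean(x_sample) * x_pop_mean


def compare_mse(y, x, n, reps=2000, seed=1):
    """Empirical mean squared error of the sample mean and of the ratio
    estimator over repeated simple random samples without replacement."""
    rng = random.Random(seed)
    y_bar, x_bar = statistics.fmean(y), statistics.fmean(x)
    sq_err_mean = sq_err_ratio = 0.0
    for _ in range(reps):
        idx = rng.sample(range(len(y)), n)   # same units for both variables
        ys = [y[i] for i in idx]
        xs = [x[i] for i in idx]
        sq_err_mean += (statistics.fmean(ys) - y_bar) ** 2
        sq_err_ratio += (ratio_estimate(ys, xs, x_bar) - y_bar) ** 2
    return sq_err_mean / reps, sq_err_ratio / reps


mse_mean, mse_ratio = compare_mse(V, T, n=4)
print(f"MSE of sample mean: {mse_mean:.3f}, MSE of ratio estimator: {mse_ratio:.3f}")
```

In this ten-unit subset the two columns move closely together, so the ratio estimator would be expected to show the smaller error. Whether that carries over to the full population of N = 100, where several rows break the pattern, would have to be checked on the complete data.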

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alim, A., Shukla, D. Digital file size computational procedure in multimedia big data using sampling methodology. Multimed Tools Appl 82, 32203–32257 (2023). https://doi.org/10.1007/s11042-023-14459-1

  • DOI: https://doi.org/10.1007/s11042-023-14459-1
